VACCINAL POLYPEPTIDES

Field of the Invention 
The present invention relates generally to polypeptides useful in
vaccine compositions and more specifically to vaccine compositions useful in
providing immunity against influenza A and influenza B in an animal. The present
invention also relates generally to a method of enhancing expression of polypeptides
and, more specifically, to a method of enhancing influenza protein expression and
homogeneity in E. coli.

Background of the Invention

Influenza virus infection causes acute respiratory disease in man,
horses, swine and fowl, sometimes of pandemic proportions. Influenza viruses are
orthomyxoviruses and, as such, have enveloped virions of 80 to 120 nanometers in
diameter, with two different glycoprotein spikes. Three types, A, B and C, infect
humans. Type A viruses have been responsible for the majority of human
epidemics in modern history, although there are also sporadic outbreaks of Type B
infections. Known swine, equine, and avian viruses have mostly been Type A,
although Type C viruses have also been isolated from swine.

The Type A viruses are divided into subtypes based on the antigenic
properties of the hemagglutinin (HA) and neuraminidase (NA) surface
glycoproteins. Within Type A, subtypes H1 ("swine flu"), H2 ("Asian flu"), and H3
("Hong Kong flu") are predominant in human infections. In swine, the predominant
influenza A subtypes are H1 and H3; in horses, H3 and H7; and in avians, H5 and
H7. Presently only one Type B virus has been identified, with no subtypes.

Genetic "drift" or "shift", i.e., rapid and unpredictable change in the
antigen, occurs at approximately yearly intervals, and affects antigenic determinants
in the HA and NA proteins. Therefore, it has not been possible to prepare a
"universal" influenza virus vaccine, that is, a vaccine which is non-strain
specific, using conventional killed or attenuated viruses. Recently, attempts have
been made to prepare such universal, or semi-universal, vaccines from reassortant
viruses prepared by crossing different strains. More recently, such attempts have
involved recombinant DNA techniques focusing primarily on the HA protein.
There remains a need in the art for vaccine formulations and
compositions capable of inducing protective responses in animals against influenza
viruses.

The expression of recombinant proteins in bacterial systems,
particularly E. coli, is highly desirable because it can be used to produce large
amounts of the desired proteins relatively inexpensively. However, high-level
expression of several eukaryotic proteins in E. coli has not been achieved for
reasons including, among others, unfavorable codon usage and toxicity of the gene
product [U. Brinkmann et al., Gene, 85:109-114 (1989)]. Methods of overcoming
these impediments to high-level expression in bacteria have been described, but are
not universally applicable.

For example, Brinkmann et al., cited above, described low-level
expression of certain genes, such as human tissue-type plasminogen activator or
gp41 of human immunodeficiency virus, which the authors attributed to the
presence of the rare triplets AGA and AGG, which encode arginine (Arg), in
unexpectedly high amounts in the gene (3.2%). However, other eukaryotic genes,
such as the NS1 gene of influenza virus, contain greater than 3% of such triplets yet
express at high levels in E. coli [Young et al., Proc. Natl. Acad. Sci., 80:6105-6109
(1983)].

Another group, Spanjaard et al., Nucl. Acids Res., 18(17):5031-5036
(1990), described a translational frame shift in about 50% of ribosomes after tandem
(double) AGA and AGG codons in cloned tRNA genes, but observed no frame shifts
following single AGG or AGA codons. The authors attributed this frame shift to
tRNA depletion.

There also remains a need in the art for improved methods of
producing vaccinal polypeptides capable of inducing protective responses in animals
against influenza viruses.

Summary of the Invention

The present invention provides compositions containing, and
methods for use of, a protein which is capable of inducing protection in animals and
avians against challenge with more than one strain of influenza Type A and
influenza Type B.

Thus, one aspect of the invention provides a DNA sequence encoding
a modified purified recombinant protein. The DNA sequence of the invention
encodes a modified protein sequence derived from the HA2 subunit of a selected
hemagglutinin (HA) protein. In one embodiment, the sequence is derived from an
H3N2 subtype influenza virus. These H3N2 fusion proteins are capable of inducing
T cell responses in the absence of neutralizing antibodies. In another embodiment, a
DNA sequence of this invention encodes a modified protein sequence derived from
the HA2 subunit from a Type B influenza virus. Still further embodiments include
DNA sequences obtained as described for the two above viruses, where the
sequences are derived from other Type A influenza strains infecting animals as well
as humans. Such viruses include, without limitation, Type A subtypes H1, H2,
H3, H4, H5, H6 and H7.

In another aspect, the invention provides a DNA sequence encoding a
recombinant fusion protein, in which the desired Type A subtype HA2 subunit
sequence, or a portion thereof, is fused in frame to another protein or protein
fragment capable of enhancing expression of the fusion protein. One embodiment
includes the H3N2 subtype HA2 subunit sequence described above fused in frame to
another protein or fragment capable of enhancing expression thereof. Another
embodiment of such a fusion protein comprises a Type B HA2 sequence, described
above, or a portion thereof, fused in frame to another protein or protein fragment
capable of enhancing expression of the fusion protein. Additionally, other Type A
subtype HA2 sequences can be similarly used. It is desirable that this fusion partner
protein be an influenza protein sequence or fragment thereof.

In still another aspect, a protein encoded by a DNA sequence of the
invention is provided. The protein may be a protein sequence derived from the HA2
subunit of an HA protein from a selected Type A subtype virus. Desirably the
subtype virus is an H3N2. In another embodiment, the protein may be derived from
the HA subunit of a Type B influenza virus. Other embodiments include H5 or H7
subtypes. Additionally, preferred embodiments include fusion proteins comprising
a protein sequence derived from the HA2 subunit of an HA protein from a Type A
virus, e.g., an H3N2 subtype, or from a Type B virus, fused in frame to a selected
influenza sequence. The proteins of this invention are particularly useful in inducing
protection in mammals, especially humans, against challenge by Type B or an H3N2
subtype of influenza A. The proteins employing other Type A subtypes, e.g., H5
and H7, are useful in inducing protection in animals against influenza viruses.

In another aspect, the invention provides a method of recombinantly
producing the fusion proteins of the invention, and a method of purifying the same.

In a further aspect, the invention provides a vaccine composition
containing a purified protein of the invention, as described above. Such a vaccine
composition may include a fusion protein of the invention. In other embodiments of
the invention, the vaccine compositions contain an H3HA2 protein of the invention
and other influenza antigens; a Type B HA2 protein of the invention and other
influenza antigens; or both an H3HA2 protein and a BHA2 protein together with other
influenza antigens. In a preferred embodiment for human use, a combination vaccine
of the invention will contain an H3HA2 and a BHA2 protein of the invention in
combination with influenza antigens derived from the other Type A influenza virus
subtypes, H1 and H2. An embodiment for use in animals may contain an H5HA2 or
H7HA2 protein, among others.

A further aspect of this invention is a method for inducing in an
animal protection against influenza Type A, influenza Type B, influenza Type C, or
combinations thereof, which comprises internally administering to the animal an
effective immunogenic amount of a vaccine composition of the present invention.

Still a further aspect of this invention is a method for inducing in an
animal protection against multiple strains of influenza Types A and B which
comprises internally administering to the animal an effective immunogenic amount
of a vaccine composition of the present invention.

In another aspect, the present invention provides a method of
enhancing in E. coli the expression of influenza vaccinal proteins characterized by a
naturally-occurring amino acid pattern comprising Arg-Arg-Xaa-Xaa-Arg [SEQ ID
NO: 8]. In this pattern, Arg is arginine, Xaa is any amino acid, and at least one of
the arginines in the naturally-occurring sequence is encoded by the rare nucleic acid
triplets AGG or AGA.

In one embodiment, the method of the invention involves mutating
one or more of these AGG or AGA codons to a preferred arginine codon and
expressing the mutated sequence in E. coli. Surprisingly, it has been found that this
modification, which does not result in a change in the encoded amino acid sequence,
can increase the expression and homogeneity of an influenza protein in E. coli
significantly.
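
By way of illustration only, the silent codon replacement described above may be
sketched in Python as follows. The function name recode_rare_arg, the default
replacement codon CGT, and the short test sequence are assumptions made for this
sketch and are not taken from the sequences of the invention; any of the four CGn
arginine codons could be substituted.

```python
# Illustrative sketch only: recode rare arginine codons in frame.
RARE_ARG_CODONS = {"AGA", "AGG"}
CGN_CODONS = {"CGT", "CGC", "CGA", "CGG"}  # all four also encode arginine


def recode_rare_arg(cds: str, preferred: str = "CGT") -> str:
    """Replace in-frame AGA/AGG codons with a CGn arginine codon.

    The encoded amino acid sequence is unchanged because AGA, AGG and
    every CGn codon all encode arginine.
    """
    cds = cds.upper()
    preferred = preferred.upper()
    if preferred not in CGN_CODONS:
        raise ValueError("replacement codon must be one of CGT, CGC, CGA, CGG")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split into codons so that only in-frame triplets are touched.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(preferred if c in RARE_ARG_CODONS else c for c in codons)


# Hypothetical 6-codon sequence: Met-Arg-Arg-Ala-Phe-Arg
native = "ATGAGAAGGGCTTTTAGA"
print(recode_rare_arg(native))  # ATGCGTCGTGCTTTTCGT
```

Because the sequence is split into codons before replacement, an out-of-frame
occurrence of AGA or AGG (for example, spanning two adjacent codons) is left
untouched.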

In another embodiment, the method of this invention involves
increasing the expression of the above-identified proteins by inserting into the host
cell tRNA molecules capable of translating the native rare arginine codons. Thus,
the E. coli host cells are modified such that they are capable of efficiently
translating the rare, native arginine codons.

In another aspect, the present invention provides novel nucleic acid
sequences of influenza proteins which contain the nucleotide sequence
CGn-CGn-Xaa-Xaa-CGn, where n represents a nucleotide selected from the group
consisting of T, C, A or G [SEQ ID NO: 9], in place of the native nucleotide sequence
AGr-AGr-Xaa-Xaa-AGr, where r represents the nucleotides A or G [SEQ ID NO: 10].
When expressed in E. coli, these sequences result in increased expression of the
encoded protein as compared to the native sequence.
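
As a hedged illustration, the native pattern AGr-AGr-Xaa-Xaa-AGr may be located in
a coding sequence with a short Python routine such as the one below. The function
name find_native_motifs and the example sequence are hypothetical, and the sketch
assumes that the supplied sequence begins in the correct reading frame.

```python
# Illustrative sketch only: locate the in-frame pattern AGr-AGr-NNN-NNN-AGr.
RARE_ARG_CODONS = {"AGA", "AGG"}  # "AGr", where r is A or G


def find_native_motifs(cds: str) -> list:
    """Return 0-based codon indices at which the native pattern starts."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    hits = []
    for i in range(len(codons) - 4):
        # Codons i and i+1 and codon i+4 must be rare arginine codons;
        # the two codons in between may encode any amino acid.
        if (codons[i] in RARE_ARG_CODONS
                and codons[i + 1] in RARE_ARG_CODONS
                and codons[i + 4] in RARE_ARG_CODONS):
            hits.append(i)
    return hits


# Hypothetical sequence: Met-Arg-Arg-Ala-Phe-Arg-stop
print(find_native_motifs("ATGAGAAGGGCTTTTAGATAA"))  # [1]
```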

In still another aspect, the invention provides the novel modified
nucleic acid sequences described above fused in the same reading frame to another
DNA sequence encoding a polypeptide or protein, i.e., a fusion partner, which may
further enhance the expression of, or immunogenicity of, the encoded influenza
protein. It is desirable that the fusion partner be an influenza protein sequence or
fragment thereof.

Other aspects and advantages of the present invention are described
further in the following detailed description of the preferred embodiments thereof.

Brief Description of the Drawings 

Fig. 1 illustrates the nucleic acid sequences of the HA2 portions of
(a) A/Udorn [SEQ ID NO: 1], (b) A/Victoria [SEQ ID NO: 3], (c) A/PR/8/34 [SEQ
ID NO: 5], and (d) a consensus sequence [SEQ ID NO: 7]. Dashes indicate the
same nucleotide as the consensus sequence. Nucleotides that differ from the
consensus sequence are reported in lower case letters. Dots indicate no
corresponding nucleotide when compared to the consensus sequence.
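
The display convention used in Fig. 1 may be sketched as follows, purely for
illustration. The function name render_against_consensus and the toy sequences are
assumptions; the sketch further assumes that the input sequence has already been
aligned to the consensus and marks missing positions with a dot.

```python
# Illustrative sketch only: render one aligned sequence relative to a consensus.
def render_against_consensus(aligned: str, consensus: str) -> str:
    """Dash = same as consensus, lower case = different, dot = no nucleotide."""
    if len(aligned) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for base, ref in zip(aligned.upper(), consensus.upper()):
        if base == ".":
            out.append(".")           # no corresponding nucleotide
        elif base == ref:
            out.append("-")           # identical to the consensus
        else:
            out.append(base.lower())  # differs from the consensus
    return "".join(out)


# Toy example, not taken from SEQ ID NO: 1, 3, 5 or 7.
print(render_against_consensus("ATGC.A", "ATGGCA"))  # ---c.-
```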

Fig. 2 illustrates the nucleic acid and amino acid sequences of
H3C13, NS1(1-81)H3HA2(1-221) fusion protein [SEQ ID NO: 9 & 10], with the
mutant nucleic acid sequences of H3C13mut5855 [SEQ ID NO: 58] illustrated
above the sequence of the unmodified H3HA2 portion.

Fig. 3 illustrates the nucleic acid and amino acid sequences of the
NS1(1-81)H3HA2(77-221) fusion protein [SEQ ID NO: 11 & 12].

Fig. 4 illustrates the nucleic acid and amino acid sequences of the
Type B fusion protein, NS1(1-42)HA2(41-223) [SEQ ID NO: 13 & 14].

Fig. 5 illustrates the pOTS208NS1BLmut2 vector nucleic acid
sequences [SEQ ID NO: 54] encoding the amino acid sequences [SEQ ID NO: 55]
of the mutant NS(1-81)BLHA2(1-223)(met-leu) fusion protein, with the nucleic
acid sequences of the coding region NS(1-81)BLHA2(1-223) [SEQ ID NO: 56] and
native amino acid sequences [SEQ ID NO: 57], which include a Met in amino acid
position 98, illustrated above the modified BLHA2 sequences.

Fig. 6 illustrates the nucleic acid [SEQ ID NO: 17] and amino acid
[SEQ ID NO: 18] sequences of the H1N1 fusion protein, NS1(1-81)HA2(65-222),
also known as flu D.

Fig. 7 illustrates the naturally-occurring nucleic acid sequence [SEQ
ID NO: 1] and corresponding amino acid sequence [SEQ ID NO: 2] of the HA2
portion of the H3N2 virus, A/Udorn.

Fig. 8 illustrates the naturally-occurring nucleic acid sequence [SEQ
ID NO: 3] and corresponding amino acid sequence [SEQ ID NO: 60] of the HA2
portion of the H3N2 virus, A/Victoria.

Detailed Description of the Invention

The present invention provides novel proteins, DNA sequences,
pharmaceutical vaccine compositions, and methods of use thereof for conferring
protection in vaccinated mammals against one strain, or desirably multiple strains,
of influenza viruses. The proteins and vaccine compositions of the present
invention demonstrate the ability to stimulate or produce a protective immune
response which is capable of recognizing an influenza virus or influenza
virus-infected cells and protecting the vaccinated mammal against disease caused
thereby. This protective response is desirably a T cell response, produced in the
substantial absence of vaccine-induced neutralizing antibody.

While the proteins and DNA sequences specifically described herein
are directed to the H3HA2 and BHA2 sequences originating from viral strains to
which humans are susceptible, it is expected that similar sequences and molecules
can be prepared for veterinary applications. For example, selected HA2 sequences
obtained from Type A viral strains, e.g., H5HA2, H7HA2 and other strains of
interest, may be obtained following the teachings described herein for the
exemplified H3HA2 and BHA2 sequences. One of skill in the art should understand
that this invention is not limited to the exemplified protein and DNA sequences,
even though the following disclosure is limited to the two latter sequences for
simplicity. Such additional viral HA2 subunits are expected to share the biological
characteristics of the exemplified sequences.

Thus, this invention provides a protein or fragment thereof
characterized by an amino acid sequence derived from the HA2 subunit of an HA
protein, e.g., from an H3N2 subtype virus. As used herein, a "fragment" of the HA2
subunit is an amino acid sequence derived from the HA2 subunit which is
characterized by having an immunogenic determinant of the HA2 subunit. Such a
fragment is desirably at least about 8 amino acids in length.

The H3 proteins of the invention are capable of inducing T helper
cells, particularly cytotoxic T lymphocytes, in the absence of neutralizing
antibodies. H3N2 subtype strains of influenza A include the A/Udorn and
A/Victoria viruses. Other H3N2 virus strains of influenza A may also produce HA
proteins for use in vaccine compositions according to this invention. Fig. 1
compares the nucleic acid sequences of the HA2 portions of the A/Udorn [SEQ ID
NO: 1] and A/Victoria [SEQ ID NO: 3] strains with the nucleic acid sequence of an
H1N1 subtype virus, A/PR/8/34 [SEQ ID NO: 5]. A consensus sequence [SEQ ID
NO: 7] was computer generated, and may likewise be useful in producing proteins
according to this invention. This consensus sequence [SEQ ID NO: 7] can be
constructed by a commercially available computerized sequence analysis program,
such as that of the Genetics Computer Group [University of Wisconsin].
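
For illustration only, one simple way of deriving a consensus from sequences that
have already been aligned is a per-column majority vote, sketched below. This is
not asserted to be the procedure used by the commercial program named above; the
function name majority_consensus, the dot used as the gap character, and the toy
sequences are assumptions of the sketch.

```python
# Illustrative sketch only: per-column majority consensus of aligned sequences.
from collections import Counter


def majority_consensus(aligned_seqs, gap="."):
    """Return the most frequent non-gap base in each alignment column."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("need one or more sequences of identical aligned length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(base.upper() for base in column if base != gap)
        # A column made up entirely of gaps stays a gap in the consensus.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


# Toy alignment, not taken from SEQ ID NO: 1, 3 or 5.
print(majority_consensus(["ATGC", "ATGA", "ACGA"]))  # ATGA
```

Columns in which two bases are equally frequent are resolved arbitrarily by this
sketch; a production tool would typically report an ambiguity code instead.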

Proteins according to this invention may include unfused HA2
subunits of the influenza A viruses, particularly the H3N2 subtype. For example, in one
embodiment, a protein of the invention contains amino acids 1-221 of a selected
H3HA2 subunit. In another embodiment, a protein of the invention contains amino
acids 77-221 of the H3HA2 subunit. Other fragments of this HA2 amino acid
sequence characterized by the ability to stimulate similar immunological activity in
an immunized animal are also encompassed by this invention.
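
The residue ranges recited above are 1-based and inclusive. A minimal sketch of
extracting such a fragment from a protein sequence held as a string is given below;
the function name fragment and the placeholder sequence are hypothetical and stand
in for an actual H3HA2 sequence.

```python
# Illustrative sketch only: extract residues start..end (1-based, inclusive).
def fragment(protein: str, start: int, end: int) -> str:
    if not 1 <= start <= end <= len(protein):
        raise ValueError("residue range lies outside the protein")
    return protein[start - 1:end]


# Placeholder of 221 residues standing in for an H3HA2 subunit.
placeholder_ha2 = "A" * 221
print(len(fragment(placeholder_ha2, 1, 221)))   # 221
print(len(fragment(placeholder_ha2, 77, 221)))  # 145
```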

Proteins of this invention also include fusion proteins comprising a
protein sequence derived from the HA2 subunit of an HA protein from a Type A
virus, e.g., an H3N2 subtype virus, fused in frame to another protein or protein
fragment capable of enhancing expression of the fusion protein. It is desirable that
this fusion "partner" protein be an influenza protein sequence or fragment thereof
derived from the same or another strain of influenza virus as the HA protein or
protein fragment. Preferably, this fusion partner protein is all or a portion of the
influenza virus NS1 protein or an HA2 subunit protein.

In the embodiments exemplified herein, the NS1 portion of the
fusion protein is derived from an H1N1 subtype virus, A/PR/8/34. For example, in
one embodiment, the NS1 portion may comprise amino acid residues 1 to 42 of
H1NS1. In another embodiment the NS1 portion may comprise amino acid residues
1 to 81 of the selected virus. The HA2 fragment may alternatively be fused to a
portion of the NS1 peptide derived from a selected Type A virus, e.g., an H3
subtype virus (H3HA2), or a Type B (BHA2) virus.

However, other non-influenza fusion partners may also produce
desirable fusion proteins with the H3N2, or other Type A, or Type B protein or
portion thereof. Thus, in still another alternative embodiment, as discussed below,
the HA2 fragment may be fused to any peptide capable of enhancing its expression
in the host cell selected. One of skill in the art may readily select a fusion "partner"
protein or fragment taking into account the desired host cell and utilizing the
teachings herein. The fusion proteins of the present invention are not limited by the
selection of the "partner" protein or fragment to which the HA2 fragment is fused.

In yet another embodiment, the present invention provides a
modified protein containing a portion of the HA2 subunit of a Type B influenza
virus. Currently, the preferred human virus strain is B/Lee/40. However, the
vaccinal proteins of this invention are not limited to this Type B strain, and other
strains infecting other species, or other as yet unidentified Type B virus strains, may
be used to produce the HA2 protein. These Type B HA2 proteins may be fused to a
fusion "partner" protein or protein fragment, as described above for the H3HA2
proteins of this invention, or remain unfused.

In the construction of a fusion protein according to this invention, a
linker sequence may optionally be inserted between the two fused sequences, i.e.,
between the NS1 portion and the HA2 portion. This optional linker may provide
space between the two linked sequences. Alternatively, this linker sequence may
encode, if desired, a polypeptide which is selectively cleavable or digestible by
conventional chemical or enzymatic methods. For example, the selected cleavage
site may be an enzymatic cleavage site, including sites for cleavage by a proteolytic
enzyme, such as enterokinase, factor Xa, trypsin, collagenase, and thrombin.
Alternatively, the cleavage site in the linker may be a site capable of being cleaved
upon exposure to a selected chemical, e.g., cyanogen bromide or hydroxylamine.
The cleavage site, if inserted into a linker useful in the fusion sequences of this
invention, does not limit this invention. Any desired cleavage site, of which many
are known in the art, may be used for this purpose.

A presently preferred example of an H3 fusion protein of this
invention is NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10], which comprises the first
81 amino acids of NS1 fused to amino acids 1 to 221 of the H3HA2 subunit
(Fig. 2). Another exemplary fusion protein, NS1(1-81)H3HA2(77-221)
[SEQ ID NO: 12], comprises the first 81 amino acids of NS1 fused to amino acids
77 to 221 of the truncated H3HA2 subunit (Fig. 3).

A presently preferred example of a Type B fusion protein of this
invention is NS1(1-42)BHA2(41-223) [SEQ ID NO: 14], which comprises the first
42 amino acids of NS1 fused to amino acids 41 to 223 of the truncated BHA2
subunit (Fig. 4). Another fusion protein of this invention is
NS1(1-81)BHA2(1-223) [SEQ ID NO: 57], which contains the first 81 amino acids of
NS1 fused to amino acids 1 to 223 of the BHA2 subunit (Fig. 5). Another preferred
fusion protein of the invention is NS1(1-81)BHA2(1-223)(met-leu) [SEQ ID NO: 55],
which contains the same amino acid sequence as NS1(1-81)BHA2(1-223), with the
exception that the internal methionine residue at position 98 of the fusion protein
has been changed to a leucine (Fig. 5).

These proteins, fusion proteins, and similar proteins encoded by the
below-described DNA sequences are referred to collectively herein as H3HA2
proteins.

The NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] of the
invention has a three-dimensional structure which is substantially similar to that of
the NS1(1-81)HA2(1-222) protein [SEQ ID NO: 16] derived from the H1N1
subtype virus (C13). However, the amino acid sequence of the
NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] has only approximately 50%
homology with the amino acid sequence of the C13 protein [SEQ ID NO: 16].
Additionally, as illustrated in Fig. 1, the nucleic acid sequence of the H3HA2(1-221)
protein derived from A/Udorn (nucleotides 23-560 from that virus) [SEQ ID NO: 1]
has only approximately 60% homology with the nucleic acid sequence of the
H1HA2(1-222) protein derived from strain A/PR/8/34 (nucleotides 1872-2407 from
A/PR/8/34) [SEQ ID NO: 5]. However, the nucleic acid sequence of H3HA2(1-221)
from A/Udorn (nucleotides 1-499 of A/Udorn) [SEQ ID NO: 1] has approximately
99% homology with the nucleic acid sequence of H3HA2(1-221) from
A/Victoria/H3/75 (nucleotides 1226-1725 of A/Victoria) [SEQ ID NO: 3] [Fiers et
al., Cell, 19:683-696 (1980)].
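
As a hedged illustration of how a percentage of the kind quoted above might be
computed, the sketch below counts identical positions between two sequences that
have already been aligned. The text does not state which program or scoring was
used to obtain the figures of 50%, 60% and 99%; the function name
percent_identity, the treatment of gaps, and the toy sequences are assumptions.

```python
# Illustrative sketch only: percent identity of two pre-aligned sequences.
def percent_identity(a: str, b: str, gap: str = ".") -> float:
    """Identical positions as a percentage of columns with no gap in either."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy sequences: 5 of 6 aligned positions are identical.
print(round(percent_identity("ATGCAT", "ATGGAT"), 1))  # 83.3
```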

Analogs of the HA2 peptides from a Type A virus, e.g., an H3, or
Type B viruses, included within the definition of this invention, include truncated
polypeptides (including fragments) and HA2 polypeptides, e.g., mutants that retain
the epitopes and thus the biological activity of HA2. It is anticipated that, because
the NS1 portion of the fusion peptide provides a means of expressing the protein at
high levels and does not appear to play as significant a role in the immunological
responses to the HA2 fusion proteins as does the HA2 portion, any number of
analogs of this fusion partner can be made.

Typically, the analogs of the HA2 peptides and/or the fusion partner
differ by only 1 to about 4 codon changes. Other examples of analogs include
polypeptides with minor amino acid variations from the natural amino acid sequence
of HA2; in particular, conservative amino acid replacements. Conservative
replacements are those that take place within a family of amino acids that are related
in their side chains. Genetically encoded amino acids are generally divided into
four families: (1) acidic = aspartate, glutamate; (2) basic = lysine, arginine,
histidine; (3) non-polar = alanine, valine, leucine, isoleucine, proline, phenylalanine,
methionine, tryptophan; and (4) uncharged polar = glycine, asparagine, glutamine,
cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are
sometimes classified jointly as aromatic amino acids. For example, it is reasonable
to expect that an isolated replacement of a leucine with an isoleucine or valine, an
aspartate with a glutamate, a threonine with a serine, or a similar conservative
replacement of an amino acid with a structurally related amino acid will not have a
significant effect on its activity, especially if the replacement does not involve an
amino acid at an epitope of the HA2 polypeptide. The construction of such analogs,
given the description herein and conventional methods of protein modification
known to one of skill in the art, is believed to be encompassed by this invention.

Currently, it is theorized that the HA2 portion of the fusion peptide
(e.g., H3HA2(1-221), H3HA2(77-221) and BHA2(41-223)) confers the majority of the
necessary epitopes for antibody binding or T cell (particularly CTL) targeting.
Once these epitope sequences are precisely identified, portions of the HA2 sequence
which are not part of these epitopes may be altered without significantly affecting
the bioactivity of the fusion protein.
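
The four side-chain families set out above lend themselves to a simple lookup. The
following Python sketch, offered only as an illustration, tests whether a proposed
replacement stays within one family; the one-letter codes are the standard
abbreviations, while the names FAMILIES, family_of and is_conservative are
hypothetical. The sketch says nothing about whether a given residue lies within an
epitope.

```python
# Illustrative sketch only: classify replacements using the four families above.
FAMILIES = {
    "acidic": set("DE"),                 # aspartate, glutamate
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),   # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    residue = residue.upper()
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError("not one of the 20 genetically encoded amino acids")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)


print(is_conservative("L", "I"))  # True  (leucine -> isoleucine)
print(is_conservative("D", "E"))  # True  (aspartate -> glutamate)
print(is_conservative("L", "D"))  # False (non-polar -> acidic)
```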

The present invention also encompasses DNA sequences of this
invention encoding the above-described proteins and fusion proteins, the sequences
characterized by having an immunogenic determinant of a modified HA2 subunit of
an HA protein, derived from a Type A virus, e.g., an H3 subtype, or Type B virus.
Other DNA sequences of this invention encode such HA2 subunits, optionally fused
to a DNA sequence encoding a protein or peptide which is capable of enhancing
expression of the protein in a selected host cell. For example, the consensus
sequence illustrated in Fig. 1(d) may provide a source of HA2 DNA. The currently
preferred embodiment provides a DNA sequence encoding a Type A virus, e.g., an
H3, or Type B HA2 protein or fragment thereof fused in frame to a DNA sequence
encoding a portion of the nonstructural influenza protein 1 (NS1).

Coding sequences for the HA2, NS1, and other viral proteins of
influenza virus can be prepared synthetically or can be derived from viral RNA or
from available cDNA-containing plasmids by known techniques. For example, in
addition to the above-cited references, a DNA coding sequence for HA from the
A/Japan/305/57 strain was cloned, sequenced and reported by Gething et al., Nature,
287:301-306 (1980). An HA coding sequence for strain A/NT/60/68 was cloned as
reported by Sleigh et al., and by Both et al., in Developments in Cell Biology,
Elsevier Science Publishing Co., pages 69-79 and 81-89, respectively, (1980). An
HA coding sequence for strain A/WSN/33 was cloned as reported by Davis et al.,
Gene, 10:205-218 (1980); and by Hiti et al., Virology, 111:113-124 (1981). An HA
coding sequence for fowl plague virus was cloned as reported by Porter et al. and by
Emtage et al., both in Developments in Cell Biology, cited above, at pages 39-49 and
157-168. Also, influenza viruses, including other strains, subtypes, and types, are
available from clinical specimens and from public depositories, such as the
American Type Culture Collection (ATCC), Rockville, Maryland, U.S.A.

Allelic variations (naturally-occurring base changes in the species
population which may or may not result in an amino acid change) of DNA
sequences encoding the H3HA2 or BHA2 protein sequences are also included in the
present invention, as well as analogs or derivatives thereof. Similarly, DNA
sequences which code for H3 or other Type A or Type B HA2 proteins of the
invention but which differ in codon sequence due to the degeneracies of the genetic
code or variations in the DNA sequence encoding H3HA2, other Type A or BHA2
proteins which are caused by point mutations or by induced modifications to
enhance the activity, half-life or production of the peptide encoded thereby are also
encompassed in the invention. Suitably, this invention provides certain silent
mutations to the coding sequences for NS1(1-81)H3HA2(1-221), which have been
found to increase expression yields. See Fig. 2. Further, the
NS1(1-81)BHA2(1-223)(met-leu)-encoding sequence, BC13mut2, in addition to
modifying the codon encoding amino acid position 98 of the fusion protein (position
17 of the HA2 portion), contains a number of silent modifications designed to
increase protein expression. See Fig. 5.
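
The relationship between position 98 of the fusion protein and position 17 of the
HA2 portion follows from the 81-residue NS1 partner (98 - 81 = 17). The sketch
below illustrates that arithmetic and a checked single-residue substitution. It
assumes that the HA2 portion begins at its residue 1 immediately after the fusion
partner, with no linker; the function names and the toy sequence are hypothetical.

```python
# Illustrative sketch only: map fusion-protein positions to HA2 positions.
NS1_LENGTH = 81  # residues contributed by the NS1(1-81) fusion partner


def fusion_to_ha2_position(fusion_pos: int, partner_len: int = NS1_LENGTH) -> int:
    """Convert a 1-based fusion-protein position to a 1-based HA2 position."""
    ha2_pos = fusion_pos - partner_len
    if ha2_pos < 1:
        raise ValueError("position lies within the fusion partner")
    return ha2_pos


def substitute(protein: str, pos: int, expected: str, replacement: str) -> str:
    """Replace the residue at 1-based pos after checking its identity."""
    if not 1 <= pos <= len(protein):
        raise ValueError("position lies outside the protein")
    if protein[pos - 1] != expected:
        raise ValueError("unexpected residue at the given position")
    return protein[:pos - 1] + replacement + protein[pos:]


print(fusion_to_ha2_position(98))       # 17
print(substitute("AMC", 2, "M", "L"))   # ALC (toy Met -> Leu change)
```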

Also covered by this invention are DNA sequences which hybridize
under stringent conditions with the DNA sequences encoding the HA2 subunit
proteins, e.g., H3HA2 or BHA2 proteins, of this invention. DNA sequences which
hybridize under non-stringent conditions with the disclosed sequences, but which
encode proteins or fragments retaining the biological activities of the H3HA2 or
BHA2 proteins, are also included in this invention. Typical conditions for stringent
or non-stringent hybridization are known to those of skill in the art. [See, e.g.,
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold
Spring Harbor Laboratory, NY (1989)].

The fusion proteins of the invention may be prepared by
conventional genetic engineering and recombinant techniques known to those of
skill in the art. Similarly, the proteins may be purified from expression in host cell
or vector systems by conventional means.

Preferably, however, the recombinantly-produced fusion proteins of
the invention are purified as described herein. Generally, the method of purification
involves (step 1) the isolation of the proteins, (step 2) enzymatic digestion and
extraction, (step 3) urea extraction, (step 4) solubilization, reduction, and DEAE
chromatography, (step 5) reverse phase chromatography, (step 6) precipitation, and
(step 7) desalting and preparation of the final product. More specifically, the host
cells containing the fusion proteins are disrupted, either chemically or by
mechanical means. Preferably the cells are lysed by osmotic shock. Following
centrifugation, the resulting pellet (P1) is subjected to nuclease digestion extraction
and centrifuged to yield pellet 2 (P2). A second extraction step is then performed
using urea (pH 6) and the mixture centrifuged to yield pellet 3 (P3). P3 is then
solubilized and reduced. Preferably, solubilization is performed using urea at pH
12.5 and reduction is via DTT; DEAE chromatography is followed by SDS elution.
The resulting DEAE product is further reduced, preferably using DTT, and
subjected to reverse phase chromatography. The reverse phase product is then
precipitated by adjusting to pH 6 and centrifuged. The precipitated product is
resolubilized, preferably with urea at pH 12.5, and subjected to G25
chromatography. The resulting G25 product is then filtered (e.g., with a 0.2 micron
filter) to yield the final product. Further details of this method are provided in
Example 17 below.

Systems for cloning and expression of the vaccinal polypeptide of
this invention in various microorganisms and cells, including, for example, E. coli,
Bacillus, Streptomyces, Saccharomyces, mammalian and insect cells, are known and
available from private and public laboratories and depositories and from commercial
vendors. The preferred host is E. coli because it can be used to produce large
amounts of desired proteins safely and cheaply. To circumvent the requirement of
ampicillin for plasmid selection in production fermentations, a desirable method of
production employs an alternative expression system in which the β-lactamase
coding sequence is wholly or partially replaced by a coding sequence for an
alternative selectable marker such as, for example, kanamycin or chloramphenicol
resistance.

Thus, the polypeptide employed in the presently preferred
embodiment is preferably expressed in E. coli. A suitable strain, LW14, has the
following genotype: galE::Tn10 λcI857 bio- uvrB-; phenotypically, strain LW14
requires biotin for growth, is sensitive to UV light and DNA damaging agents, and
cannot use galactose as a carbon source. Construction of this strain is described in
the examples below.

To aid in expression of the H3 or other Type A subunit or Type B
HA2 peptides or fusion protein described above, these protein sequences or
fragments thereof may also be fused to a polypeptide capable of enhancing
expression of these fragments in the selected host system. Ordinarily, such a
peptide would contain a leader sequence fragment that provides for secretion of the
Type A subunit fragment, e.g., the H3HA2 fragment, or Type B HA2 fragment in
the host cell. The leader sequence fragment typically encodes a signal peptide
comprised of hydrophobic amino acids which direct the secretion of the protein
from the cell. There may be processing sites encoded between the leader sequence
and the Type A subtype or Type B HA2 fragment that can be cleaved either in vivo
or in vitro. Alternatively, a promoter sequence may be linked directly with the
DNA molecule encoding the HA2 fragment. Such polypeptides, promoter and
leader sequences are known to those of skill in the art and may be readily selected
for expression in the selected host.

Construction of expression systems, including expression vectors and
transformed host cells, is thus within the art. See, generally, methods described in
standard texts, such as Sambrook et al., Molecular Cloning, A Laboratory Manual,
2d edit., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989). The
present invention is therefore not limited to any particular expression system or
vector, nor to any particular purification process from cell lysates or cell medium.

The proteins and fusion proteins of this invention may be employed
in vaccine compositions. Pharmaceutical vaccine compositions of this invention,
therefore, contain an effective immunogenic amount of a selected HA2 protein, e.g.,
H3HA2 or BHA2 protein, of the invention in admixture with a suitable adjuvant in a
nontoxic and sterile pharmaceutically acceptable carrier.

Suitable carriers for vaccine use are well known to those of skill in
the art. However, exemplary carriers include sterile saline, lactose, sucrose, calcium
phosphate, gelatin, dextrin, agar, pectin, peanut oil, olive oil, sesame oil, squalene,
and water. Additionally, the carrier or diluent may include a time delay material,
such as glyceryl monostearate or glyceryl distearate alone or with a wax.
Optionally, suitable chemical stabilizers may be used to improve the stability of the
pharmaceutical preparation. Suitable chemical stabilizers are well known to those
of skill in the art and include, for example, citric acid and other agents to adjust pH,
chelating or sequestering agents, and antioxidants.

While any aluminum adjuvant may be used in the vaccine
compositions of this invention, two desirable adjuvants are available commercially,
i.e., REHSORPTAR™ adjuvant [Armour Pharmaceuticals, Kankakee, IL] and
REHYDRAGEL™ adjuvant [Reheis Chemical Co., Berkeley Heights, NJ]. These
products are aluminum hydroxide gels which contain approximately 2% w/v Al2O3,
which is equivalent to approximately 10.6 mg/ml Al3+.
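The equivalence between 2% w/v Al2O3 and roughly 10.6 mg/ml aluminum can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the molar masses are rounded standard atomic weights that do not appear in the text.

```python
# Rounded standard atomic weights in g/mol (an assumption of this sketch).
M_AL = 26.98
M_O = 16.00


def al_mg_per_ml(al2o3_percent_wv):
    """Convert an Al2O3 concentration in % w/v to mg/ml of elemental aluminum."""
    # 1% w/v = 1 g per 100 ml = 10 mg/ml
    al2o3_mg_per_ml = al2o3_percent_wv * 10.0
    # Mass fraction of aluminum in Al2O3: two Al atoms per formula unit.
    al_fraction = (2 * M_AL) / (2 * M_AL + 3 * M_O)
    return al2o3_mg_per_ml * al_fraction


print(round(al_mg_per_ml(2.0), 1))  # 10.6
```

With these rounded values the aluminum mass fraction is about 0.53, so 20 mg/ml Al2O3 corresponds to about 10.6 mg/ml aluminum, in agreement with the figure stated above.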

Vaccine compositions of this invention may employ an immunogenic
amount of a purified recombinant protein as described above. A preferred
embodiment of the vaccine of the invention is composed of an aqueous suspension
or solution containing the recombinant HA2 protein molecule, e.g., H3HA2 or
BHA2, together with an adjuvant, preferably an aluminum adjuvant, most preferably
aluminum hydroxide, buffered at physiological pH, in a form ready for injection. A
preferred protein for use in these vaccine compositions includes a protein
comprising amino acid residues 1 to 81 from NS1 fused to C-terminal amino acid
residues 1-221 from the hemagglutinin subunit 2 (HA2) from influenza A, subtype
H3N2. Another preferred vaccine composition of this invention employs a purified
recombinant protein made up of amino acid residues 1 to 81 from NS1 fused to
amino acid residues 77-221 of the HA2 from influenza A, subtype H3N2. Still
another preferred vaccine composition of this invention employs a purified
recombinant protein made up of amino acid residues 1 to 42 fused to amino acid
residues 41-223 of the HA2 from influenza B.

Vaccine compositions of the invention may also employ an
immunogenic amount of a recombinant protein of the invention in combination with
other influenza antigens. Suitable influenza antigens for combination in a vaccine
composition with the proteins of this invention may be derived from Type A, H1
subtype viruses and may include the recombinant fusion proteins described in detail
in copending U. S. Patent Application Ser. No. 07/387,200, filed July 28, 1989 and
its corresponding European Patent Application No. 366,238, published May 2,
1990; and in co-pending U. S. Patent Application Ser. No. 07/387,558, filed July
28, 1989 and its corresponding European Patent Application No. 366,239, published
May 2, 1990. The C13 protein (NS1(1-81)HA2(1-222)) [SEQ ID NO: 15 & 16], D
protein (NS1(1-81)HA2(65-222)) [SEQ ID NO: 17 & 18] and other fusion proteins
derived from the H1N1 influenza virus subtype and the recombinant expression and
purification thereof are disclosed in detail in these applications, and in the parent
applications identified in this application, all of which are incorporated by reference
herein.

More specifically, suitable H1 subtype immunogenic proteins include
C13 (NS1(1-81)-D-L-S-R-HA2(1-222)) [SEQ ID NO: 15 & 16], D (NS1(1-81)-Q-
I-P-HA2(65-222)) [SEQ ID NO: 17 & 18], C13 short (NS1(1-42)-M-D-L-S-R-
HA2(1-222)) [SEQ ID NO: 19 & 20], D short (NS1(1-42)-M-D-H-M-L-T-S-T-R-S-
HA2(66-222)) [SEQ ID NO: 21 & 22], A (NS1(1-81)-Q-I-P-HA2(69-222)) [SEQ
ID NO: 23 & 24], C (NS1(1-81)-Q-I-P-HA2(81-222)) [SEQ ID NO: 25 & 26], AD
(NS1(1-81)HA2(150-222)) [SEQ ID NO: 27], A13 (NS1(1-81)-D-L-S-R-HA2(1-
70)-S-C-L-T-A-Y-H-R) [SEQ ID NO: 28], M (NS1(1-81)-Q-I-P-HA2(65-196)-G-
G-S-Y-S-M-E-H-F-R-W-G-K-P-V) [SEQ ID NO: 29], AM (NS1(1-81)-Q-I-P-
HA2(65-196)-G-G-S-Y-S-M-L-V-N) [SEQ ID NO: 30], AM+ (NS1(1-81)-Q-I-P-
HA2(65-200)-L-V-L-L) [SEQ ID NO: 31 & 32]. These H1N1 fusion proteins are
described in published European Patent Application 366,238 and in copending U.S.
Patent Application Ser. No. 07/751,896. Other suitable H1 proteins consist of
unfused polypeptides, such as H1HA2(66-222) [SEQ ID NO: 33 & 34] which is
disclosed in co-pending U. S. Patent Application Ser. No. 07/751,898, incorporated
herein by reference. Thus, one desirable combination vaccine to provide protection
against Type A influenza contains NS1(1-81)H3HA2(1-221) protein [SEQ ID NO:
9 & 10] of the invention, one or more proteins derived from subtype H1N1 as
described above, and an aluminum adjuvant.

Preferably, a combination vaccine of the invention will contain an
immunogenic amount of the H3 fusion protein of the invention in combination with
immunogenic amounts of influenza antigens derived from the other Type A
influenza virus subtypes, including, among others, H1, H2, H3, H4, H5, H6 and H7,
as well as a Type B fusion protein of the invention.

A currently preferred combination vaccine of the invention contains
the H3 subtype fusion protein NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10], the B
subtype fusion protein NS1(1-81)BHA2(1-223)(met-leu) [SEQ ID NO: 55], and the
H1 subtype fusion protein NS1(1-81)HA2(65-222) [SEQ ID NO: 18]. Studies have
shown that such a combination vaccine is protective against challenge with H1, H3
and Type B influenza viruses in mice.

Other preferred combination vaccines would include the NS1(1-
81)H3HA2(77-221) protein [SEQ ID NO: 12] or the NS1(1-81)BHA2(1-223) [SEQ
ID NO: 57] in combination with one or more additional influenza antigens derived
from the type or subtype influenza viruses described above. These combination
vaccines will protect against influenza infections caused by both Type A and Type
B influenza viruses. Still other combination vaccine compositions will employ
other proteins described herein.

The compositions of the present invention are advantageously made
up in a dose unit form adapted for the desired mode of administration. Each unit
will contain, at a minimum, a predetermined quantity of the selected HA2 subunit
protein, e.g., H3HA2 protein and/or BHA2 protein, and adjuvant calculated to
produce the desired therapeutic effect in optional association with a pharmaceutical
diluent, carrier or vehicle.

Dosage protocol can be optimized in accordance with standard
vaccination practices. Typically, the vaccine will be administered intramuscularly,
although other routes of administration may be used, such as intradermal. It is
expected that an effective immunogenic amount of a protein, fusion protein or
combination of proteins of this invention for average adult humans is in the range of
1 to 1000 micrograms. Another desirable immunogenic amount ranges from 50
to 500 micrograms. Most preferably, the proteins of the invention are in admixture
with the same amount or more adjuvant to form a vaccine composition.

While the proteins described herein have been particularly developed
for use in humans (e.g., the H3HA2 and BHA2 sequences), it is expected that due to
species cross-reactivity, these vaccines will be useful in other animals, particularly
swine. Additionally, similar molecules can be prepared for equine and avian
veterinary applications utilizing the HA2 proteins from other strains to which
animals are susceptible. Combination vaccines for use in swine would preferably
include protection against both H1 and H3 viruses. Combination vaccines for use
in equine would preferably include protection against H3 and H7 viruses.
Combination vaccines for use in avian species would preferably confer protection
against H5 and H7 viruses. Appropriate dosages can be determined by one skilled
in veterinary medicine.

It will be understood, however, that the specific effective
immunogenic amount for any particular patient will depend upon a variety of factors
including the age, general health, sex, and diet of the vaccinee; the species of the
vaccinee; the time of administration; the route of administration; interactions with
any other drugs being administered; and the degree of protection being sought.

The vaccine can be administered initially in late summer or early fall
and can be readministered two to six weeks later, if desirable, or periodically as
immunity wanes, for example, every two to five years. Of course, as stated above,
the administration can be repeated at suitable intervals if necessary or desirable.

The present invention provides methods for producing enhanced
expression and improved homogeneity of influenza viral proteins and polypeptides
in E. coli. Also provided are novel modified nucleotide sequences which encode
these influenza proteins and are useful in the methods of production.

Preferably, the influenza proteins or polypeptides produced according
to the invention include the complete HA2 protein of the hemagglutinin antigen
(HA) of a selected H3N2 influenza virus, a complete HA protein of an H3N2
virus, fragments thereof, and fusion proteins containing the complete H3HA2
protein or desired fragments thereof fused in the same reading frame with a selected
fusion partner polypeptide or protein. These proteins are characterized by having
the native amino acid sequence pattern described above.

By the term "fragment" is meant a subunit of HA, or a span of
contiguous amino acids from the complete protein capable of stimulating an
antigenic or protective immunogenic response in an animal. A fragment may
contain at least about 8 amino acids from the selected influenza protein, and can
contain up to the number of amino acids which make up the entire protein. When
the term "fragment" is used herein to modify a nucleotide sequence, it refers to
nucleotide sequences which encode the above-defined amino acid fragments.

Native (or naturally-occurring) nucleotide sequences which encode
certain influenza proteins are characterized by a nucleotide sequence pattern
encoding the fragment Arg-Arg-Xaa-Xaa-Arg [SEQ ID NO:61]. Arg represents
arginine and Xaa represents any amino acid in this formula. Hereafter, this five
amino acid sequence is referred to as Formula I.

Formula I sequences are typically encoded by native nucleotide
sequences of the formula of codons AGr-AGr-Xaa-Xaa-AGr, where r represents the
nucleotides A or G and Xaa represents any codon [SEQ ID NO:63]. Hereafter, this
five codon nucleotide sequence is referred to as Formula II. Specifically, the native
nucleic acid sequence encoding a subtype H3N2 influenza virus protein, fusion
protein, or a fragment or subunit thereof, specifically the HA2 portions of H3N2
virus strains, is characterized by a Formula II sequence.
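A minimal sketch of how a coding sequence could be scanned for the Formula II codon pattern is shown below. It assumes the input is a DNA coding sequence read in frame from its first nucleotide; the function name and the toy sequence are hypothetical and are not taken from the figures. Because Xaa is treated as any three nucleotides, the sketch does not exclude stop codons at those positions.

```python
import re

# Formula II: AGr-AGr-Xaa-Xaa-AGr, where r is A or G and Xaa is any codon.
FORMULA_II = re.compile(r"AG[AG]AG[AG][ACGT]{6}AG[AG]")


def find_formula_ii(coding_seq):
    """Return 1-based nucleotide positions where an in-frame Formula II motif starts."""
    seq = coding_seq.upper()
    hits = []
    # Step codon by codon so that only in-frame occurrences are reported.
    for start in range(0, len(seq) - 14, 3):
        if FORMULA_II.fullmatch(seq, start, start + 15):
            hits.append(start + 1)
    return hits


# Hypothetical toy sequence: ATG GCT | AGG AGA GCT GAA AGG | TAA
toy = "ATGGCTAGGAGAGCTGAAAGGTAA"
print(find_formula_ii(toy))  # [7]
```

Stepping in units of three nucleotides is a deliberate choice: the pattern is defined in terms of codons, so a match that straddles the reading frame would not encode Arg-Arg-Xaa-Xaa-Arg.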

H3N2 subtype strains of influenza A characterized by this
nucleotide fragment, Formula II, include the A/Udorn and A/Victoria viruses. Figs. 7
and 8 provide the native nucleic acid sequences of the HA2 portions of the A/Udorn
[SEQ ID NO: 1] and A/Victoria [SEQ ID NO: 3] strains. Other H3N2 virus strains
of influenza A may also provide native nucleotide sequences containing Formula II,
which sequences are susceptible to the modifications described herein.

Additional examples of native nucleotide sequences encoding
proteins whose expression may be enhanced according to this invention are those
native sequences which encode certain fragments of influenza proteins including the
fragment spanning amino acids 1 to about amino acids 221 of H3HA2 [Fig. 7 SEQ
ID NO:2 and Fig. 8 SEQ ID NO:3]; the fragment spanning from about amino acid
77 to about amino acid 221 [Fig. 7 SEQ ID NO:69 and Fig. 8 SEQ ID NO:70], or
other desirable fragments. Other desirable fragments of this H3HA2 amino acid
sequence include those characterized by the ability to stimulate immunological
activity in an immunized animal similar to that stimulated by use of the entire 221
amino acid sequence of H3HA2.

Nucleotide sequences encoding fusion proteins which contain
fragments of the native nucleotide sequences encoding these influenza proteins or
subunits, e.g., the fusion protein NS1(1-81)H3HA2(1-221) [SEQ ID NO:10], can
also be characterized by the Formula II nucleotide sequence. Thus these fusion
proteins are also desirable for enhanced expression according to the method of this
invention.

The inventors have discovered that when native nucleotide sequences
of influenza proteins, which sequences comprise Formula II, are expressed in E.
coli, a frame shift of one nucleotide after the third triplet in Formula II in the native
sequence occurs, resulting in the increased translation of truncated proteins. It has
been surprisingly found that by application of a method of the present invention, the
expression and homogeneity of the influenza protein is increased significantly.

The methods of this invention involve enhancing the expression of
proteins characterized by the amino acid pattern of Formula I, which proteins have a
native nucleotide sequence of Formula II. According to one embodiment of the
method of this invention, a native nucleotide sequence encoding a selected influenza
protein or fragment, which sequence comprises Formula II, is modified by mutating
one or more of the rare AGG or AGA arginine codons of Formula II to a preferred
Arg codon. A preferred arginine codon for use in replacing a native AGA or AGG
codon according to this invention is defined herein by the codons CGT, CGC, CGA
and CGG. Of these codons, CGT and CGG are currently the most preferred. The
modified influenza protein-encoding nucleotide sequence is then expressed in an E.
coli expression system, resulting in enhanced expression in comparison to that
obtained by expression of the native nucleotide sequence encoding the same protein
in the same expression system.
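The codon replacement of this first embodiment can be illustrated in code. The following is a sketch under stated assumptions, not the inventors' procedure: it assumes an in-frame DNA coding sequence, replaces all three rare Arg codons of each Formula II motif, and defaults to CGT, one of the preferred codons named above. The function name and toy sequence are hypothetical.

```python
import re

# Formula II: AGr-AGr-Xaa-Xaa-AGr, where r is A or G and Xaa is any codon.
FORMULA_II = re.compile(r"AG[AG]AG[AG][ACGT]{6}AG[AG]")


def mutate_formula_ii(coding_seq, preferred="CGT"):
    """Replace the three rare Arg codons of each in-frame Formula II motif."""
    if preferred not in {"CGT", "CGC", "CGA", "CGG"}:
        raise ValueError("preferred must be a CGN arginine codon")
    # Split the sequence into codons, reading in frame from the first nucleotide.
    codons = [coding_seq[i:i + 3].upper() for i in range(0, len(coding_seq), 3)]
    for i in range(len(codons) - 4):
        window = "".join(codons[i:i + 5])
        if FORMULA_II.fullmatch(window):
            # Codons 1, 2 and 5 of the motif are the AGA/AGG arginine codons.
            for j in (i, i + 1, i + 4):
                codons[j] = preferred
    return "".join(codons)


# Hypothetical toy sequence: ATG GCT | AGG AGA GCT GAA AGG | TAA
toy = "ATGGCTAGGAGAGCTGAAAGGTAA"
print(mutate_formula_ii(toy))  # ATGGCTCGTCGTGCTGAACGTTAA
```

Since AGA, AGG and all four CGN codons encode arginine in the standard genetic code, the replacement leaves the encoded amino acid sequence unchanged.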

The enhanced protein expression occurs even though the mutation
does not result in a change in the encoded amino acid sequence of the protein. By
the terms "enhanced expression" or "enhanced protein expression" is meant an
expression level of at least 40% higher than the expression level of the protein
encoded by the native, non-mutated nucleotide sequence comprising Formula II,
when expressed in E. coli.
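The 40% threshold in this definition can be written as a simple comparison. The sketch below assumes that both expression levels are measured in the same units under the same expression system; the function name is hypothetical and the text does not prescribe how the levels are quantified.

```python
def is_enhanced(native_level, modified_level, threshold=0.40):
    """Return True if the modified sequence expresses at least 40% above native."""
    if native_level <= 0:
        raise ValueError("native_level must be positive")
    # Relative increase of the modified sequence over the native sequence.
    relative_increase = (modified_level - native_level) / native_level
    return relative_increase >= threshold


print(is_enhanced(100, 140))  # True
print(is_enhanced(100, 139))  # False
```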

While not wishing to be bound by theory, the inventors believe that
the enhanced expression levels are obtained because the silent mutation of the AGA
or AGG to a preferred arginine codon in Formula II eliminates the frame shift
mutation found in the unmutated nucleotides encoding these proteins, thus
substantially reducing the production of truncated messages (proteins). It is
believed that the resulting influenza proteins are more homogeneous when
expressed in an E. coli expression system according to this invention.

In a second embodiment of the method of the invention, the
expression of the proteins containing arginines encoded by the rare codons AGG
and AGA (i.e. proteins encoded by amino acid and nucleotide sequences
characterized by Formulae I and II) can be increased by inserting into the host in
which expression is desired one or more genes for tRNA molecules which are
capable of properly translating the AGG and AGA arginine codons. Preferably the
host cells are E. coli.

This method can be accomplished as follows. A gene for a tRNA
molecule described above can be selected from among known gene sequences. The
genes and tRNA molecules which can translate the rare Arg codons identified above
are known and readily available to one of skill in the art. See, e.g., [P. Saxena and
J. Walker, J. Bacteriol., 174(6):1956-1964 (Mar. 1992)].

According to conventional techniques, these genes may be placed on
a plasmid which will increase the copy number of these genes and therefore the
tRNA molecules encoded by these genes. Alternatively, these sequences can be
genetically engineered and placed on the host cell chromosome behind an
appropriate promoter element in such a manner that the effective concentration of
these tRNA molecules is increased inside the cell. Conventional texts describe the
techniques useful in this method [See, e.g., Sambrook et al., Molecular Cloning, A
Laboratory Manual, 2d edition, Cold Spring Harbor, New York (1989)].

The insertion of the tRNA genes into the host cell expressing the
protein increases the concentration of these tRNA molecules inside the host cells
which are naturally deficient for these tRNA molecules. This allows the host cells
to translate these rare arginine codons in an efficient manner, eliminating the
production of the truncated or lower molecular weight species of the fusion protein
observed in the unmodified host cell. Thus, this method may be used to increase
expression of a protein in host cells lacking sufficient amounts of the appropriate
tRNA to permit efficient expression of the protein. Use of this method obviates the
need to modify the sequences encoding the selected protein, and thus provides an
alternative method to the first embodiment described above.

As another aspect of this invention novel modified nucleotide
sequences are provided, which in E. coli expression systems, can be employed to
produce the encoded influenza proteins, subunits, fragments and fusion proteins
described above according to the first embodiment of the method of this invention.
The proteins encoded by these nucleotides are produced at levels of expression
enhanced over that of the native sequences, by about forty percent or more. The
novel nucleotide sequences of the invention are characterized by comprising the
nucleotide sequence CGn-CGn-Xaa-Xaa-CGn, where n represents a nucleotide
selected from the group consisting of T, C, A or G [SEQ ID NO:62], in place of the
Formula II fragment in the native nucleotide sequence encoding the selected
influenza protein or fragment. The nucleotide fragment identified by the formula
above is referred to herein for simplicity as Formula III.

For example, a modified DNA sequence of the invention comprises
the Formula III nucleotide sequence and may encode the amino acid sequences
identified specifically above, e.g., Fig. 7 [SEQ ID NO:2], Fig. 8 [SEQ ID NO:3];
Fig. 7 [SEQ ID NO:69] and Fig. 8 [SEQ ID NO:70], or other fragments.

In one example of the present invention, the nucleic acid sequence
encoding the HA2 subunit protein which contains the native sequence of Formula II
has been provided with three silent mutations, which have changed each of the three
native arginine-encoding AGG codons each to a preferred arginine codon CGT.
These codons encode amino acid numbers 123, 124 and 127 of the H3HA2 subunit
protein of the A/Udorn strain identified in Fig. 7. The same codons (and amino acid
numbers) are altered in the A/Victoria strain identified in Fig. 8 to provide another
example of a modified nucleotide sequence according to this invention.

Thus, with reference to each of Figs. 7 and 8, the native nucleotide
sequences encoding the HA2 subunit proteins of the aforementioned viruses [SEQ
ID NO:1 and 60], are modified according to this invention at nucleotides 367, 370,
and 379. At each of these nucleotide sites, the native A (adenine) is changed to a C
(cytosine) and the native nucleotides at sites 369, 372 and 381 in each sequence are
changed from a G (guanine) to a T (thymine), resulting in preferred Arg codons.
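The nucleotide sites listed above follow directly from the amino acid numbers 123, 124 and 127. The short sketch below shows the arithmetic, assuming that nucleotide 1 is the first nucleotide of the codon for amino acid 1 of the HA2 coding sequence; the function name is hypothetical.

```python
def codon_positions(aa_number):
    """Return the 1-based nucleotide positions of the codon for an amino acid number."""
    first = 3 * (aa_number - 1) + 1
    return (first, first + 1, first + 2)


# For each AGG -> CGT change, the first base (A -> C) and third base (G -> T) move.
for aa in (123, 124, 127):
    first, _, third = codon_positions(aa)
    print(aa, first, third)
# 123 367 369
# 124 370 372
# 127 379 381
```

The first positions (367, 370, 379) are the A to C changes and the third positions (369, 372, 381) are the G to T changes, matching the sites stated in the text.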
Other nucleotide sequences encoding the influenza vaccinal
polypeptides described herein, or other such influenza proteins or subunits
characterized by Formula II may be mutated into novel nucleotide sequences of this
invention, i.e., by mutating Formula II into Formula III within those sequences
using the first embodiment of the methods of this invention. The silent mutations
described herein may be inserted at analogous regions in each nucleotide sequence.

The novel modified H3HA2 nucleotide sequences, whether alone or
in association with a nucleotide sequence encoding a fusion partner of a fusion
protein of the invention are useful in E. coli expression systems. The novel
nucleotide sequences of the invention will also encode analogs of the H3HA2
peptides, such as truncated polypeptides (including fragments) and H3HA2
polypeptides, e.g. mutants that retain the epitopes and thus the biological activity of
H3HA2. Where the nucleotide sequence encodes a fusion protein, it is anticipated
that, because the non-HA2 fusion partner, e.g., NS1 as described below,
provides a means of expressing the protein at high levels and does not
appear to play as significant a role in the immunological responses to the HA2
fusion proteins as does the HA2 portion, any number of analogs of this fusion
partner can be made.

Typically, the analogs of the nucleotide sequences encoding the HA2
peptides and/or the fusion partner may differ by only 1 to about 4 codon changes, in
addition to the nucleotide mutations to the above-identified fragment. Other
sequences of this invention include modified nucleotide sequences which encode
polypeptides with minor amino acid variations from the natural amino acid sequence
of HA2. For example, conservative amino acid replacements may be introduced by
altering, deleting or replacing codons of the native sequence, in addition to altering
those codons in Formula II according to one embodiment of this method.

Conservative replacements are those that take place within a family
of amino acids that are related in their side chains and are well known in the art.
For example, it is reasonable to expect that an isolated conservative replacement
of a selected amino acid with a structurally related amino acid
will not have a significant effect on the activity of the protein,
especially if the replacement does not involve an amino acid at an epitope of the
HA2 polypeptide.

The construction of modified nucleotide sequences and proteins or
fusion proteins, given the description herein and conventional methods of protein
modification known to one of skill in the art, is believed to be encompassed by this
invention.

The novel modified nucleotide sequences of this invention are further
characterized by encoding an immunogenic determinant of a modified HA2 subunit
of an HA protein, derived from an H3N2 subtype. The encoded protein may
contain all or a portion of the H3N2 HA2 sequence, including the Formula I amino
acid sequence. The currently preferred embodiment provides a novel DNA
sequence encoding an H3HA2 protein or fragment thereof fused in frame to a DNA
sequence encoding a portion of the nonstructural influenza protein 1 (NS1). One
modified fusion protein-encoding nucleotide sequence is obtained by making
mutations according to this invention in the nucleotide sequence encoding the fusion
protein NS1(1-81)H3HA2(1-221) [SEQ ID NO:10]. Upon mutation, the nucleotide
sequence [SEQ ID NO:58] for this fusion protein [SEQ ID NO: 10] is referred to
herein as pOTS208NS1H3mut5585.

The modified coding sequences for the HA2 proteins, as well as the
coding sequences for NS1 and other viral proteins of influenza virus can be prepared
synthetically or can be derived from viral RNA or from available cDNA-containing
plasmids by known techniques. For example, see references known to the art which
disclose the nucleotide coding sequences for HA from the A/Japan/305/57 strain
[Gething et al., Nature, 287:301-306 (1980)]; strain A/NT/60/68 [Sleigh et al. and
Both et al., in Developments in Cell Biology, Elsevier Science Publishing Co.,
pages 69-79 and 81-89, respectively, (1980)]; strain A/WSN/33 [Davis et al., Gene,
10:205-218 (1980); Hiti et al., Virology, 111:113-124 (1981)]; and fowl plague
virus [Porter et al. and by Emtage et al., both in Developments in Cell Biology,
cited above, at pages 39-49 and 157-168]. Also, influenza viruses, including other
strains, subtypes and types, are available from clinical specimens and from public
depositories, such as the American Type Culture Collection (ATCC), Rockville,
Maryland, U.S.A.

Novel modified nucleotide sequences of this invention may also
include allelic variations (naturally-occurring base changes in the species population
which may or may not result in an amino acid change) of DNA sequences encoding
the H3HA2 protein sequences, and the Formula III fragment [SEQ ID NO:62].
Similarly, DNA sequences having the Formula III fragment, which sequences
encode other H3N2 HA2 proteins of the invention include sequences which differ in
codon sequence outside of Formula II due to degeneracies of the genetic code or
variations in the DNA sequence encoding H3HA2 proteins. Such codon differences
may be caused by point mutations or by induced modifications to enhance the
activity, half-life or production of the peptide encoded thereby. Also covered by
this invention are DNA sequences characterized by the above modification of
Formula II into Formula III, which hybridize under stringent conditions with the
DNA sequences encoding the HA2 subunit proteins, e.g., H3HA2 proteins, of this
invention. DNA sequences which hybridize under non-stringent conditions with the
disclosed sequences, but which encode proteins or fragments retaining the biological
activities of the H3HA2 proteins, are also included in this invention. Typical
conditions for stringent or non-stringent hybridization are known to those of skill in
the art [See, e.g., Sambrook et al., cited above].

The actual techniques for producing the mutations described herein
are now conventional to the art of genetic engineering, and are readily known and
available to one of skill in the art. See, e.g., Sambrook et al., cited above. Such
conventional techniques include, for example, site directed mutagenesis, which is
available in commercial kits from, e.g., Clontech and Promega Corporation. Other
suitable techniques include, e.g., total gene synthesis and removing the fragment and
replacing it with a synthetically generated, mutated fragment. It is anticipated that
similar modifications to any H3HA2 sequence having an analogous codon pattern
will result in the enhanced expression in E. coli, exemplified by the modified
H3HA2 sequence.

The mutations described herein are preferentially developed for
increased expression of the influenza protein or fusion protein in E. coli, which is
the preferred host because it can be used to produce the desired proteins safely and
cheaply. To circumvent the requirement of ampicillin for plasmid selection in
production fermentations, a preferred method of production which uses the modified
nucleotide sequences of this invention employs an alternative expression system in
which the β-lactamase coding sequence is wholly or partially replaced by a coding
sequence for an alternative selectable marker, such as kanamycin or
chloramphenicol.

To aid in expression of the H3HA2 peptides or fusion proteins, these
protein sequences or fragments thereof may also be fused to a polypeptide capable
of further enhancing expression of these fragments in the selected host system.
Ordinarily, such a peptide would contain a leader sequence fragment that provides
for secretion of the H3HA2 subunit fragment in the host cell. The leader sequence
fragment typically encodes a signal peptide comprised of hydrophobic amino acids
which direct the secretion of the protein from the cell. There may be processing
sites encoded between the leader sequence and the H3HA2 fragment that can be
cleaved either in vivo or in vitro. Alternatively, a promoter sequence may be linked
directly with the DNA molecule encoding the H3HA2 fragment. Such polypeptides,
promoter and leader sequences are known to those of skill in the art and may be
readily selected for expression in the selected host.

Construction of bacterial expression systems, preferably E. coli
expression systems, including expression vectors and transformed host cells, is also
within the skill of the art. See, generally, methods described in standard texts, such
as Sambrook et al., cited above. The present invention is therefore not limited to
any particular vector, nor to any particular purification process from cell lysates or
cell medium.

Influenza proteins encoded by the modified nucleotide sequence may
be expressed in enhanced manner according to the first embodiment of the method
of this invention, or the influenza proteins may be expressed in an enhanced manner
by translation from their native sequences by the second embodiment of the method.
Additionally, the methods of this invention may be used to enhance the expression
of a fusion protein which comprises a protein sequence encoded by the modified
nucleotide sequence containing Formula III in place of Formula II in the native
nucleotide sequence encoding an HA2 subunit of an HA protein from an H3N2
subtype virus, fused in frame to another protein or protein fragment (a "fusion
partner") capable of enhancing expression of the fusion protein.

One of skill in the art may readily select a fusion partner protein or
fragment taking into account the desired host cell, i.e., E. coli, and utilizing the
teachings herein. For the purposes of this invention, the H3HA2 fragment or
sequence encoded by a modified nucleotide sequence as described above or the
native sequence used in the second embodiment of this method may be fused to any
peptide capable of further enhancing its expression in the host cell selected or of
increasing its immunogenicity. The method of the present invention does not limit
the nature of the "partner" protein or fragment to which the H3HA2 fragment is
fused to provide the enhanced expression of the resulting fusion protein.

For example, the influenza protein or fragment bearing the amino 
acid sequence of Formula I may be fused to a number of conventionally known and 
used "partner" proteins [See, general texts on expression such as Current Protocols 
in Molecular Biology, Vol. 2, suppl. 10, publ. John Wiley and Sons, New York, 
NY, pp. 16.4.1-16.8.1 (1990); Smith et al., Gene, 67:31-40 (1988); U. S. Patent No. 
4,801^36, among others]. However, it may be desirable that this fusion "partner" 
protein be an influenza protein sequence or fragment thereof derived from the same 
or another strain of influenza virus as the HA protein or protein fragment. 
Preferably, this fusion partner protein is all or a portion of the influenza virus NS1 
gene or an HA2 subunit. 

In such a fusion protein, a linker sequence may optionally be inserted 
between the two sequences, i.e., between the sequence encoding the fusion partner 
and the HA2 protein encoded by the modified nucleotide sequence of this invention 
or the native sequence for expression according to the second embodiment of the 
method. This optional linker may provide space between the two protein sequences, 
and may encode a polypeptide or contain a cleavage site which is selectively 
cleavable or digestible by conventional chemical or enzymatic methods. An 
example of a fusion protein whose expression can be enhanced by a method of this 
invention is NS1(1-81)H3HA2(1-221), illustrated in Fig. 2 [SEQ ID NO: 10], which 
comprises the first 81 amino acids of NS1 (derived from an H1N1 subtype virus, 
A/PR/8/34) fused to the sequence spanning amino acids 1 to 221 of the H3HA2 
subunit via an optional four amino acid linker sequence. 
Another exemplary fusion protein, NS1(1-81)H3HA2(77-221) [SEQ ID NO: 12], 
comprises the first 81 amino acids of NS1 fused to the sequence spanning amino 
acids 77 to 221 of the truncated H3HA2 subunit. In other embodiments, the NS1 
portion may comprise the sequence spanning amino acid residues 1 to 
42 of H1N1. The HA2 fragment may alternatively be fused to a portion of the NS1 
peptide derived from a selected Type A virus, e.g., an H3 subtype virus (H3N2). 
These proteins, their native nucleotide sequences, and their uses are described in 
co-pending U.S. application 07/837,773, filed February 18, 1992, which is 
incorporated by reference. 

As described below in the examples, the host cells used to express 
these fusion proteins may be modified by the second embodiment of the method of 
this invention to contain tRNA molecules capable of translating the rare arginine 
codons of Formula II. See, e.g., Example 25. Alternatively, the nucleic acid 
sequence encoding these and other suitable H3HA2 proteins or H3HA2-containing 
proteins, i.e., those comprising a native Formula II sequence [SEQ ID NO: 9], may 
be modified by the first embodiment of the method of this invention to replace 
Formula II with the Formula III sequence to increase the expression of the encoded 
protein in E. coli according to the method of this invention. 

The proteins and fusion proteins whose expression is enhanced by the 
methods of this invention may be employed in vaccine compositions. Several of the 
specific influenza proteins or fusion proteins described herein, which are produced 
according to the methods of this invention, have demonstrated the ability to 
stimulate or produce a protective immune response capable of recognizing an 
influenza virus or influenza virus-infected cells and protecting the vaccinated 
mammal against disease caused thereby. This protective response is desirably a T 
cell response, produced in the substantial absence of vaccine-induced neutralizing 
antibody. Such H3HA2 proteins and fusion proteins are capable of inducing T 
helper cells, particularly cytotoxic T lymphocytes, in the absence of neutralizing 
antibodies. 

Pharmaceutical vaccine compositions can contain an effective 
immunogenic amount of a selected H3HA2 protein produced according to this 
invention or encoded by a modified nucleotide sequence of this invention in 
admixture with a suitable adjuvant in a nontoxic and sterile pharmaceutically 
acceptable carrier. Suitable carriers for vaccine use, as well as other vaccine 
formulation additives and adjuvants, are well known to those of skill in the art. See, 
e.g., European Patent Application No. 366,238, published May 2, 1990; and 
European Patent Application No. 366,239, published May 2, 1990. Such 
compositions may be effectively administered to human and animal patients to 
induce the appropriate immune response. The details of dosage and treatment using 
such compositions are also described in the above-cited published patent 
applications. 

The following examples illustrate methods for preparing H3HA2 and 
BHA2 fusion proteins of the invention and demonstrate the subtype specific 
protection against heterologous virus induced upon vaccination with the H3HA2 
proteins. The following examples also illustrate methods for preparing the modified 
DNA sequences of the invention. All of these examples are illustrative only and do 
not limit the scope of the invention. 


EXAMPLE 1 - PLASMID pMS3H3HA 

Plasmid pFV88 contains the entire 221 amino acid length HA from 
A/Udorn, an H3 subtype virus [C. J. Lai et al., Proc. Natl. Acad. Sci. USA, 77:210- 
214 (1980)], which HA nucleic acid sequence is illustrated in Fig. 1 [SEQ ID NO: 
1]. This plasmid was cut with PstI. The resulting 1900 bp fragment, which 
contains the entire HA (HA1 and HA2) fragment and some GC tailing, was then 
inserted into pUC18 [Bethesda Research Laboratories]. The resulting plasmid is 
termed pMS3 or pMS3H3HA. 



EXAMPLE 2 - pMG1 

Plasmid pAPR801 is a pBR322-derived cloning vector which carries 
the NS1 coding region (A/PR/8/34). It is described by Young et al., in The Origin of 
Pandemic Influenza Viruses, ed. by W. G. Laver, Elsevier Science Publishing Co. 
(1983). 

Plasmid pAS1 is a pBR322-derived expression vector which contains 
the PL promoter, an N utilization site (to relieve transcriptional polarity effects in 
the presence of N protein), and the cII ribosome binding site including the cII 
translation initiation codon followed immediately by a BamHI site. It is described 
by Rosenberg et al., in Methods Enzymol., 101:123-138 (1983). 

Plasmid pAS1ΔEH was prepared by deleting a non-essential EcoRI-HindIII 
region of pBR322 origin from pAS1. A 1236 base pair BamHI fragment of 
pAPR801, containing the NS1 coding region in 861 base pairs of viral origin and 
375 base pairs of pBR322 origin, was inserted into the BamHI site of pAS1ΔEH. 
The resulting plasmid, pAS1ΔEH/801, expresses authentic NS1 (230 amino acids). 
The plasmid has an NcoI site between the codons for amino acids 81 and 82 and an 
NruI site 3' to the NS sequences. The BamHI site between amino acids 1 and 2 is 
retained. 

Plasmid pMG27N, a pAS1 derivative [Mol. Cell. Biol., 5:1015-1024 
(1985)], was cut with BamHI and SacI and ligated to a BamHI/NcoI fragment 
encoding the first 81 amino acids of NS1 from pAS1ΔEH801 and a synthetic DNA 
NcoI/SacI fragment of the following sequence: 
SEQ ID NO: 35: 

5'-CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT-3' 

SEQ ID NO: 36: 

3'-CTAGTATACAATTGTCTATAGTTCCGGACTGACTGACTC-5'. 

The resulting plasmid, pMG1, allows the insertion of DNA fragments 
after the first 81 amino acids of NS1 in any of the three reading frames within the 
synthetic linker fragment followed by termination codons in all three reading 
frames. 
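
The statement that the linker supplies termination codons in all three reading frames can be checked with a short script. The following is only an illustrative sketch in Python: the function name is hypothetical, the sequence is the upper strand of SEQ ID NO: 35 as transcribed here, and frames are counted from the first nucleotide of the oligonucleotide rather than from the NS1 reading frame, which is an assumption made for simplicity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_positions_by_frame(seq):
    """Return {frame: [0-based start index of each stop codon]} for frames 0, 1 and 2.

    The frame of a codon is its start index modulo 3, counted from seq[0].
    """
    seq = seq.upper()
    hits = {0: [], 1: [], 2: []}
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in STOP_CODONS:
            hits[i % 3].append(i)
    return hits

# Upper strand of the synthetic NcoI/SacI linker (SEQ ID NO: 35), written 5' to 3'.
LINKER_TOP = "CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT"

hits = stop_positions_by_frame(LINKER_TOP)
for frame in (0, 1, 2):
    print(f"frame {frame}: stop codons start at {hits[frame]}")
```

The overlapping TGACTGACTGA run near the 3' end is what places one TGA in each frame, so an insert cloned upstream of it terminates regardless of the frame it joins.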


Plasmid pMG1, described above in Example 2, was digested with 
NcoI and XbaI, releasing a 54 bp fragment, which was discarded. Plasmid 
pMS3H3HA, described in Example 1 above, was digested with HhaI and XbaI, and 
a 701 bp fragment containing the coding sequence for the HA2 subunit of influenza 
strain A/Udorn (H3N2) was isolated, as illustrated in Fig. 1 [SEQ ID NO: 1]. 

Synthetic oligonucleotides were annealed to generate an NcoI 5' 
overhang sequence (at the 5' end) and a HhaI 3' overhang sequence (at the 3' end). 
The sequence of these oligonucleotides is as follows: 

SEQ ID NO: 37: 5'-CATGGGCGCCCATATGGGCATATTCGGCG-3' 
SEQ ID NO: 38: 3'-CCGCGGGTATACCCGTATAAGCC-5'. 

The annealing reaction was performed as follows. The annealing mixture was made 
up of 2.5 µL each of the 5' oligo (1.3 µg/µL) and the 3' oligo (1.2 µg/µL), with water 
(15 µL) added to a final volume of 20 µL. The reaction tubes were then placed in 4 mL 
culture tubes containing water which had been heated to 65°C for 10 minutes and 
allowed to cool down slowly. The tubes were then put on ice and used immediately 
for ligation. 
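
The overhangs expected from annealing SEQ ID NO: 37 and SEQ ID NO: 38 can be verified computationally. The sketch below is illustrative only: the helper name and the offset argument are hypothetical, and it assumes the lower strand is written 3' to 5' and begins pairing at the fifth nucleotide of the upper strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def duplex_overhangs(top_5to3, bottom_3to5, offset):
    """Return (left overhang, right overhang) of the top strand after annealing.

    bottom_3to5[0] is assumed to pair with top_5to3[offset]. A ValueError is
    raised if the strands are not fully complementary over the paired region.
    """
    paired_top = top_5to3[offset:offset + len(bottom_3to5)]
    if paired_top.translate(COMPLEMENT) != bottom_3to5:
        raise ValueError("strands are not complementary at this offset")
    return top_5to3[:offset], top_5to3[offset + len(bottom_3to5):]

TOP = "CATGGGCGCCCATATGGGCATATTCGGCG"   # SEQ ID NO: 37, 5' to 3'
BOTTOM = "CCGCGGGTATACCCGTATAAGCC"      # SEQ ID NO: 38, 3' to 5'

left, right = duplex_overhangs(TOP, BOTTOM, offset=4)
print("5' overhang:", left)    # single-stranded 5' end of the upper strand
print("3' overhang:", right)   # single-stranded 3' end of the upper strand
```

A four-base CATG 5' extension is what NcoI digestion leaves, and a two-base CG 3' extension is what HhaI digestion leaves, so the annealed pair can bridge the NcoI-cut vector and the HhaI end of the HA2 fragment.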

This three part ligation generates pMG1H3HA2(1-221) [SEQ ID 
NO: 9], which codes for the first 81 amino acids of NS1 fused to four amino acids 
donated from the linker and amino acids 1-221 of the HA2 subunit. This sequence 
is illustrated in Fig. 2 [SEQ ID NO: 9 & 10]. This molecule is also designated 
NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10] or H3C13. 

EXAMPLE 4 - NS1(1-81)H3HA2(77-221) [SEQ ID NO: 11 & 12] 

pMS3H3HA, described in Example 1 above, was digested with 
EcoRI and end-filled (Klenow). Subsequently, the vector was digested with XbaI. 
A 487 bp fragment, which contains the coding sequence for amino acids 77-221 of 
the HA2 subunit, was isolated and ligated to the HpaI and XbaI sites of pMG1. The 
resulting vector codes for a fusion polypeptide containing amino acids 1-81 of NS1 
fused to amino acids 77-221 of the HA2 subunit. This molecule has been termed 
NS1(1-81)H3HA2(77-221) and is illustrated in Fig. 3 [SEQ ID NO: 11 & 12]. 


EXAMPLE 5 - pMG42BLHA2 

To derive a vector similar to pMG1 (described in Example 2), which 
contains the coding region for the first 42 amino acids of NS1 rather than the first 
81 amino acids of NS1, pMG1 was digested with BamHI and NcoI and ligated to 
the BamHI/NcoI fragment encoding amino acids 2 to 42 of NS1 from pNS1₄₂TGFα. 
pNS1₄₂TGFα is derived when pAS1ΔEH801 is cut with NcoI and SalI and 
ligated to a synthetic DNA encoding human TGFα as an NcoI/SalI fragment. 
pNS1₄₂TGFα encodes a protein comprised of the first 42 amino acids of NS1 and 
the mature TGFα sequence. The NS1 portion of pNS1₄₂TGFα contains an amino 
acid change from Cys to Ser at amino acid 13. 

The resulting plasmid, termed pMG42A, was then modified to 
contain an alternative synthetic linker after the NS1₄₂ sequence with a different set 
of restriction enzyme sites within which to insert foreign DNA fragments into the 
three reading frames after the NS1₄₂. This linker has the following sequence: 

SEQ ID NO: 39: 

5'-CATGGATCATATGTTAACAAGTACTCGATATCAATGAGTCACTCAAGCr-3' 

SEQ ID NO: 40: 

3'-CTAGTATACAATTGTTCATGAGCTATAGTTACTCACTGACT-5'. 

The resulting plasmid is called pMG42B. This vector was then modified to contain the 
neomycin phosphotransferase-1 (NPT-1) gene, which confers kanamycin resistance. 

As described in Shatzman and Rosenberg, Meth. Enzymol., 152:661- 
673 (1987), pOTS207 is a pAS derived cloning vector which carries the kanamycin 
resistance gene from Tn903 [Berg et al., Microbiology, ed. D. Schlessinger, pp. 13- 
15, American Society for Microbiology (Washington, DC 1978); Nomura et al., The 
Single-Stranded DNA Phages, ed. D. Denhardt et al., pp. 467-472, Cold Spring 
Harbor Laboratory (New York 1978); Castellazzi et al., Molec. Gen. Genet., 
112:211-218 (1982)]. It was constructed by digesting plasmid pUC8 [Yanisch- 
Perron et al., Gene, 33:103-119 (1985)] with BamHI and ligating it to a BclI fragment 
containing the kanamycin gene from Tn903. The resulting plasmid, pUC8-Kan, 
was digested with EcoRI and PstI, and the fragment containing the kanamycin gene 
was inserted between the EcoRI and PstI sites of pOTSV [Shatzman and Rosenberg, 
cited above]. The resulting plasmid is pOTS207. 

The pOTS207 was digested with EcoRI and PstI, and the 1467 bp 
fragment containing the kanamycin resistance gene was isolated. Synthetic 
oligonucleotides: 

SEQ ID NO: 41: 5' AATTCGTACCTA 3' 

SEQ ID NO: 42: 3' GCATGGATCTAG 5' 

were made to link the NPT-1 gene to the pMG42B vector. pMG42B was digested with 
BglII and PstI. The EcoRI/PstI NPT-1 gene fragment and the synthetic oligo linker 
were ligated to the digested pMG42B. The resulting plasmid, pMG42Kn, allows 
fusions, in three different reading frames, to the NS1₄₂ gene, while allowing 
antibiotic selection with kanamycin. 

Plasmid pBHA is a pBR322-derived vector containing the complete 
nucleotide sequence of the HA gene of a Type B influenza virus (B/Lee/40). It is 
described by Krystal et al., Proc. Natl. Acad. Sci. USA, 79:4800-4804 (1982). 
pBHA was digested with RsaI and an 813 bp fragment containing the HA subunit 
was isolated. This fragment was ligated into plasmid pMG42Kn (described above) 
that had been digested with ScaI. During the cloning, a nucleotide base (T) was 
deleted from the ScaI recognition site, shifting the gene out of the reading frame. 
The vector was digested with NcoI and filled in using Klenow, putting the gene 
back into the reading frame. 

The resulting construct, pMG42BLHA2 [SEQ ID NO: 14], expresses 
a fusion polypeptide containing amino acids 1-42 of NS1 and 41-233 of the HA2 
subunit. This construct contains the Cys to Ser change at amino acid 13 of the NS1 
portion of the fusion peptide. 

In preliminary studies with this construct, vaccinated laboratory mice 
demonstrated protection from challenge with Type B influenza in the absence of 
neutralizing antibody for the virus. 

EXAMPLE 6 - PREPARING SEED VIRUS AND RAISING ANTISERA 

The seed virus, A/Udorn, was prepared according to the procedures 
described in P. Palese and J. Schulman, Virol., 57:227-237 (1974). Briefly, this 
technique is as follows. 

Influenza virus strain A/Udorn was inoculated into the allantoic cavity of 
10-day old embryonated hen's eggs. The eggs were incubated for 
24-48 hours at 35°C, then chilled at 4°C overnight. A portion of the eggshell over 
the airsac was removed and the allantoic fluid was aseptically removed using a 
10-ml syringe. The fluid was centrifuged at low speed (3,000 x g) to remove 
particulates. This clarified supernatant was centrifuged at high speed using an 
SW28 Beckman rotor at 27,000 rpm (4°C for 90 minutes), resulting in the virus 
pellet. The virus was resuspended in 10 mM Tris (pH 7.5) containing 100 mM 
NaCl, 1 mM EDTA and repelleted as before. The virus was layered on a 30-60% 
sucrose gradient in 1 mM EDTA (NTE) and spun for 3-5 hours at 25,000 rpm. The 
band in the middle of the tube was withdrawn, diluted in NTE and centrifuged at 
27,000 rpm for 90 minutes. The pellet was suspended in phosphate-buffered saline 
(PBS). These viral particles were used as immunogens for preparation of antisera. 

Antisera were prepared as follows. 100-200 micrograms of purified 
virus in complete Freund's adjuvant was injected into the subscapula of a New 
Zealand White rabbit. A second injection in incomplete Freund's adjuvant was done 
4 weeks later, and the animals were bled and antisera collected 7-10 days later. 

EXAMPLE 7 - EXPRESSION OF H3HA2 FUSION PROTEINS 

A. NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10] 

The plasmid pMG1H3HA2(1-221) [SEQ ID NO: 9] was transfected 
into E. coli strain AR58 [SmithKline Beecham Pharmaceuticals]. Cultures were 
grown at 32°C to mid-log phase, at which time cultures were shifted to 39.5°C for 2 
hours. The E. coli cell pellets containing the recombinant polypeptide were then 
stored at -70°C until used. 

Production of the NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 
10] was confirmed by Western blot analysis [Towbin et al., Proc. Natl. Acad. Sci. 
U.S.A., 76:4350 (1979)] using antisera prepared against A/Udorn virus, as described 
in Example 5. A major immunoreactive species was found at a molecular weight of 
35,050 daltons. 

B. NS1(1-81)H3HA2(77-221) [SEQ ID NO: 11 & 12] 

The plasmid encoding the NS1(1-81)H3HA2(77-221) peptide [SEQ 
ID NO: 12] was expressed as described in part A above. Production of this peptide 
was confirmed by Western blot analysis, as described above. A major 
immunoreactive species was found at a molecular weight of 26,697 daltons. 

EXAMPLE 8 - PARTIAL PURIFICATION OF H3HA2 FUSION PROTEINS 

E. coli cell pellets containing the recombinant polypeptides, prepared 
as described in Example 6, were stored at -70°C until used. E. coli cells were 
thawed and resuspended in lysis buffer A (50 mM Tris-HCl, 5% glycerol, 2 mM 
EDTA and 0.1 mM DTT, pH 8.0) at 10 mL/gram. The stirred suspension was then 
treated with lysozyme (0.2 mg/mL) for 45 minutes at room temperature and 
sonicated twice for 2-3 minutes each time with a Sonicator. The resultant suspension 
was treated with 0.1% DOC for 60 minutes at 4°C, then centrifuged at 25,000 x g. 
The pellet was resuspended by sonication in 50 mM glycine pH 10.0, 5% glycerol, 2 
mM EDTA, and then the suspension was treated with 1% Triton X-100 [J.T. Baker 
Chemicals Co.] at 4°C for 60 minutes and centrifuged as above. 

The resulting pellet was solubilized in 50 mM Tris, 8 M urea, pH 8.0 
and centrifuged to remove any insoluble material. This solubilized material is 
dialyzed against 10 mM Tris, 1 mM EDTA, pH 8.0, followed, 
again, by centrifugation of insoluble material. The solubilized material is 
designated as "crude" material and is used in in vitro and in vivo mouse assays. At 
this point, the material is approximately 40 - 50% pure. 

The "crude" material was electrophoresed on an SDS-PAGE gel and the 
appropriate H3HA2 protein bands were visualized by KCl staining according to D. 
Hager et al., Anal. Biochem., 109:76-86 (1980). The band was cut out and eluted 
electrophoretically by the "S&S Elutrap Electro-Separation System" [Schleicher & 
Schuell]. The electro-eluting buffer was Tris-glycine. A concentrated and 
eluted sample was obtained and exhaustively dialyzed against 0.01 M NH4HCO3 
and 0.02% SDS [M. Hunkapiller et al., Meth. Enzymol., 91:227-236 (1983)]. 
This sample was frozen quickly on dry ice and lyophilized to complete dryness. 
The lyophilized material was brought back into solution using 50 mM Tris pH 8.0 
and used for in vitro and in vivo mouse assays. 

Following this gel elution step, the protein is usually greater than 
75% pure. 


EXAMPLE 9 - CONSTRUCTION OF pOTS208 VECTORS 

pOTSV is described in Devare et al., 43-49 (1984). 
Briefly, this vector is a pAS1 derivative with toop inserted at the NruI site and a 
synthetic oligonucleotide encoding SacI, XhoI and XbaI restriction sites inserted at 
the SalI site (which is destroyed). 

A. pOTS208 

pOTS208 was prepared by digesting pOTSV with EcoRI and ScaI, 
followed by a fill-in reaction using Klenow. Tn5 plasmid DNA [described in R. 
Jorgensen et al., Mol. Gen. Genet., 177:65-72 (1979)] was digested with HindIII 
and SmaI, followed by a fill-in reaction using Klenow, yielding a 1323 bp fragment 
encoding the neomycin phosphotransferase-2 gene (NPT-2). This fragment is 
described in detail in Rothstein et al., Cell, 19:795-805 (1980) and Jorgensen, cited 
above. This fragment and the above digested vector were ligated together to create 
pOTS208, which is kanamycin resistant. 

B. pOTS208H3C13 

pMG1H3HA2(1-221) (Example 3) was digested with BamHI and 
XbaI, releasing two fragments: an 806 bp BamHI fragment and a 160 bp 
BamHI/XbaI fragment. These fragments together code for NS1(1-81)H3HA2(1-221). 
A three part ligation between the two fragments and BamHI/XbaI digested 
pOTS208 (part A) yielded pOTS208H3C13, which utilizes NPT-2 for kanamycin 
resistance. 

C. pOTS208NS181Nco 

pOTS208H3C13 (part B) was digested with BglII. pSelect 
[Promega] was cut with BamHI and ligated with the BglII fragment, resulting in 
pSelectNPTII102. Transformation into E. coli JM101 [ATCC E. coli 33876] was 
followed by selection on kanamycin and tetracycline plates. KanR was conferred by 
the NPT2 region from pOTS208H3C13. Some lambda sequence was also on the 
BglII fragment. Oligo 4852, SEQ ID NO: 49 GCATCGCCATGAGTCACGACG, 
was used to mutate the NcoI site to CCATGA in pSelectNPTII102, resulting in 
pSelectNPTII102-8. This vector was cut with BstEII and BssHII. pOTS208H3C13 
was cut with BstEII, BssHII and SphI, and fragment exchange generated 
pOTS208NS181H3HA2.26. This clone has the NcoI site of NPT2 mutated. 
pOTS208NS181H3HA2.26 was cut with NcoI and SalI, filled in and ligated 
with Linker 1041 [New England Biolabs] to insert a KpnI site and regenerate the 
NcoI site. This step also deletes the H3C13 region. The unique XbaI site of the 
parent pOTS208 vector is downstream of the deletion. The resulting vector is 
pOTS208NS181Nco. 

EXAMPLE 10 - MODIFICATION OF GENE ENCODING H3HA2 FUSION PROTEIN 

In order to increase yield of the H3HA2 protein, silent mutations to 
certain rare arginine codons were made to the coding sequence of the H3HA2 
protein. These nucleotide changes resulted in no change in the protein sequence. 

A mutant H3C13 protein was prepared by mutating the nucleotide 
sequences of the fusion protein prepared according to Example 3 above. Site 
directed mutagenesis using the Altered Sites System [Promega Corporation] 
according to the manufacturer's directions was used to change nucleotide numbers 
622, 625, and 634 (A to C) and 624, 627, and 636 (G to T) of the nucleotide sequence 
[SEQ ID NO: 9] encoding the NS1(1-81)H3HA2(1-221) fusion protein of Fig. 2 
[SEQ ID NO: 10], thereby changing the codons at these regions from AGG to CGT, 
both encoding Arg. These changes correspond to nucleotide numbers 367, 370, and 
379 (A to C) and 369, 372, and 381 (G to T) of the HA2 fragment of Fig. 2 [SEQ 
ID NO: 58]. 
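
The codon replacement described above can be illustrated with a short Python sketch. This is not the procedure used in the example, which relied on oligonucleotide-directed mutagenesis. The function names are hypothetical, and the 54-nucleotide "native" window is an assumption: it is inferred by taking the mutagenic oligonucleotide of SEQ ID NO: 43 and restoring AGG at the three changed codons, with HA2 nucleotide 349 assumed to be its first base. The sketch also checks that the two numbering schemes differ by 3 x (81 + 4) nucleotides, i.e., the NS1 portion plus the four linker residues.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def recode(dna, positions, origin, old, new):
    """Replace codon `old` with `new` at 1-based gene positions.

    `origin` is the 1-based gene position of dna[0]; each position must be
    the first base of an in-frame codon that currently reads `old`.
    """
    for pos in positions:
        i = pos - origin
        if i % 3 != 0 or dna[i:i + 3] != old:
            raise ValueError(f"no in-frame {old} codon at position {pos}")
        dna = dna[:i] + new + dna[i + 3:]
    return dna

# Assumed native window (HA2 numbering, first base = nucleotide 349).
NATIVE = ("AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG "
          "AGG GAA AAT GCT GAC GAC ATG GGC").replace(" ", "")
HA2_POSITIONS = (367, 370, 379)

mutant = recode(NATIVE, HA2_POSITIONS, origin=349, old="AGG", new="CGT")
fusion_positions = [p + 3 * (81 + 4) for p in HA2_POSITIONS]

print(mutant)
print(translate(NATIVE) == translate(mutant))  # silent change: protein unchanged
print(fusion_positions)                        # numbering in the fusion gene
```

Because AGG and CGT both encode arginine, the translated peptide is identical before and after recoding, which is the sense in which the mutations are silent.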

Fig. 2 illustrates the modified nucleotide sequences of the fusion 
protein [SEQ ID NO: 10] by contrast with the nucleotide sequence [SEQ ID NO: 9] 
of the "unmodified" fusion gene (nucleotide changes above sequences of unmodified 
gene). Mutagenesis on this sequence was carried out according to the method 
provided with the pSelect kit from Promega. 

A. NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] 

Briefly, cloning for the mutagenesis was performed as follows. The 
pSelect plasmid [Promega] and pMG1H3HA2 (Example 3) were each digested with 
liifljini. These two plasmids were ligated together and selected on tetracycline 
plates. The resulting vector is pSelH3HA2. Mutagenesis was performed according 
to Promega's kit. The following oligonucleotide was used: SEQ ID NO: 43: 
5'-AAACTGTTTG AAAAAACACG TCGTCAACTG CGTGAAAATG 
CTGACGACAT GGGC-3'. 

Clones were verified by restriction endonuclease HincII. The 
resulting plasmid, pSelH3HA2mut5585, was digested with NcoI and XbaI, and a 748 
bp fragment coding for the H3HA2mut5585 polypeptide was isolated. 

pOTS208NS181Nco (Example 9C) was digested with NcoI and 
XbaI. The ligation of linear pOTS208NS181Nco and the 748 bp fragment resulted 
in pOTS208NS1H3mut5585 [SEQ ID NO: 7]. This vector codes for the 
polypeptide NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10]. 

B. Expression of mutated gene encoding H3C13 protein 

The plasmid of A was transfected into E. coli strain AR58 
[SmithKline Beecham]. Cultures are grown at 32°C to mid-log phase, at which time 
cultures are shifted to 39.5°C for two hours. The E. coli cell pellets containing the 
recombinant polypeptide are then stored at -70°C until used. Production of the 
NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] is confirmed by Western blot 
analysis [Towbin et al., Proc. Natl. Acad. Sci. U.S.A., 76:4350 (1979)] using 
antisera prepared against A/Udorn virus, as described in Example 4. A major 
immunoreactive species is expected at a molecular weight of approximately 35,000 
daltons. 

The expression levels obtained are about 50-100% higher than those 
obtained by the expression of the unmodified coding sequences in the same 
expression system. 

pSelH3HA2mut5585 (part A) was subjected to site-directed 
mutagenesis, as described above. Oligo SEQ ID NO: 44 
TGTGACAATGCTK}CATCCX}TrCAATCCGTAATCK}TAOT 
TGATG, was used and clones were verified by restriction endonuclease RsaI. The 
resulting plasmid, pSelH3HA2mut2, was digested with NcoI and XbaI, and an 
approximately 748 bp fragment encoding the H3HA2mut2 polypeptide was 
isolated. pOTS208NS181Nco was digested with NcoI and XbaI. The ligation of 
linear pOTS208NS181Nco (Example 9C) and the 748 bp fragment resulted in 
pOTS208NS1H3mut2. This vector codes for the NS1(1-81)H3HA2(1-221) 
polypeptide [SEQ ID NO: 10]. 

EXAMPLE 11 - PLASMID pD 

Plasmid pAS1ΔEH/801 (described above in Example 2) was cut with 
BglII, end-filled with DNA polymerase I (DNApolI; Klenow), and ligated closed, 
thus eliminating the BglII site. The resulting plasmid pBgl' was digested with NcoI, 
end-filled with DNApolI (Klenow), and ligated to a BglII linker. The resulting 
plasmid, pB4, contains a BglII site within the NS1 coding region. Plasmid pB4 was 
digested with BglII and ligated to a synthetic DNA linker of the sequence: 

SEQ ID NO: 45: 5'-GATCCCGGGTGACTGACTGA-3' 

SEQ ID NO: 46: 3'-GGCCCACTGACTGACTCTAG-5'. 

The resulting plasmid, pB4+, permits insertion of DNA fragments 
within the linker following the coding region for the first 81 amino acids of NS1, 
followed by termination codons in all three reading frames. Plasmid pB4+ was 
digested with XmaI (cuts within linker), end-filled (Klenow), and ligated to a 520 
base pair PvuII/HindIII, end-filled fragment derived from the HA2 coding region. 
The resulting plasmid, pD, codes for a protein [SEQ ID NO: 18] comprised of the 
first 81 amino acids of NS1, three amino acids derived from the synthetic DNA 
linker (Gln-Ile-Pro), followed by amino acids 65-222 of the HA2. 

Expression is obtained by transfecting pD into a desired E. coli 
strain, preferably LW14, using standard techniques. Purification may be by 
standard techniques or, preferably, as described in Example 18 below. 



EXAMPLE 12 - H3 SUBTYPE HETEROLOGOUS PROTECTION ELICITED BY 
VACCINATION WITH NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] 

Mice (NIH/Swiss; 15 per group) were vaccinated subcutaneously 
with 50 or 10 µg NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10] in aluminum 
hydroxide on days 0 and 21. The mice were boosted intraperitoneally on day 42 
with the protein without adjuvant. On day 47, mice were challenged intranasally 
with 2 - 3 LD50 doses of either A/PR/8/34 (H1N1) or A/HK/68 (H3N2) virus, and 
survival was monitored through day 21. This represents a heterologous challenge 
(A/PR/8/34) and an H3 heterosubtypic challenge, since the NS1(1-81)H3HA2(1-221) 
construct [SEQ ID NO: 9 & 10] was derived from A/Udorn/72 cDNA. The 
control group received adjuvant (CFA) only. 

The results in Table 1 below show that survival in mice vaccinated 
with NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] and challenged with A/HK/68 
(80-93%) was significantly higher than in control mice which were injected with 
adjuvant only (26% survival). In contrast, vaccination with NS1(1-81)H3HA2(1-221) 
[SEQ ID NO: 10] did not confer protection against challenge with A/PR/8/34, 
an H1N1 strain (0-26% survival). Thus protection elicited by NS1(1-81)H3HA2(1-221) 
[SEQ ID NO: 10] is selective for antigenically diverse virus strains within the 
H3 subtype. 

Likewise, vaccination with the D protein (NS1(1-81)HA2(65-222) 
[SEQ ID NO: 18], derived from the H1N1 subtype) elicits protection from 
heterosubtypic challenge with H1N1, but not the H3N2 subtype [S. Dillon et al., 
Nature (1992); Mbawuike et al., FASEB J., 5:A1362 (abs. 5749 and Table 1)]. These 
results in outbred mice also suggest that the response to the H1 and H3 proteins will 
not be restricted to a limited number of individuals with certain major 
histocompatibility alleles, and therefore the vaccine will be effective in a majority of 
individuals. 

Table 1 

Percent Survival After Challenge: 

Immunization                    HA Subtype    A/PR/8/34 (H1N1)    A/HK/68 (H3N2) 

50 µg NS1(1-81)H3HA2(1-221)     H3            26                  80* 
10 µg NS1(1-81)H3HA2(1-221)     H3            0                   93* 
10 µg NS1(1-81)HA2(66-222)      H1            67*                 13 
A/HK/68 virus                   H3            60*                 100* 
Control                         -             0                   26 

*p ≤ 0.05 vs. control in Fisher's exact probability test 
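
The significance marks in Table 1 can be explored with a small Fisher's exact test in Python. This is a sketch under stated assumptions rather than a reproduction of the original analysis: survivor counts are back-calculated from the reported percentages with 15 mice per group (e.g., 80% taken as 12 of 15 and 26% as 4 of 15), a two-sided test is assumed because the text does not say which form was used, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g., vaccinated, control); columns are outcomes
    (survived, died). The p-value sums the hypergeometric probabilities of
    all tables, with the same margins, that are no more likely than the
    observed one. Integer weights are compared to avoid rounding issues.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(total, col1)

# Assumed counts: 50 µg group vs. control, A/HK/68 (H3N2) challenge.
p_h3 = fisher_exact_two_sided(12, 3, 4, 11)
# Assumed counts: 50 µg group vs. control, A/PR/8/34 (H1N1) challenge.
p_h1 = fisher_exact_two_sided(4, 11, 0, 15)

print(f"H3N2 challenge: p = {p_h3:.4f}")
print(f"H1N1 challenge: p = {p_h1:.4f}")
```

Under these assumed counts the H3N2 comparison falls below 0.05 and the H1N1 comparison does not, which agrees with the pattern of asterisks in the table.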

Vaccination of mice with live homologous (A/HK/68) virus provided 
complete or partial protection, reflecting protection mediated by neutralizing 
antibody (homologous H3N2 challenge) and/or CTL (heterologous H1N1 
challenge), respectively. 

Duration of protective immunity was tested by immunizing mice 
subcutaneously with the recombinant influenza protein plus adjuvant on days 0 and 
21. Some mice were also given an ip injection of the protein 
(without adjuvant) on day 42. Mice were challenged with A/HK/68 (H3N2) on day 
47, four weeks after the second injection. Control mice were immunized as 
described above for Table 1, where an ip injection was given at week 6 (5 days prior 
to challenge). The results in Table 2 show that CB6F1 mice (15 per group) were 
significantly protected when challenged with the A/HK/68 heterologous H3 virus 
strain 5-28 days after the last injection. 

Table 2 

Dose (µg per injection)                    Injection    Percent 
of NS1(1-81)H3HA2(1-221)      Adjuvant     Schedule     Survival 

50 µg                         CFA          0,21         86* 
50 µg                         CFA          0,21,42      100* 
0 µg                          CFA          0,21         6 
50 µg                         Al+3         0,21         93* 
50 µg                         Al+3         0,21,42      93* 
0 µg                          Al+3         0,21         0 

*p ≤ 0.05 vs. control in Fisher's exact probability test 

EXAMPLE 13 - TYPE A CROSS-PROTECTION WITH D AND H3C13 PROTEIN 

Mice (CB6F1) were divided randomly into six groups, with fifteen in 
each group. The mice were injected subcutaneously with proteins in Al+3 (100 µg) 
on days 0 and 21, and then were challenged with 2-3 LD50 doses of virus on day 
49. Survival was monitored through day 21. The results of this study are illustrated 
in Table 3 below. For convenience, NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] is 
referred to as H3C13 in the table below. 
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Table 3 

Percent Survival After Challenge with: 



                        HA          A/PR/8/34    A/HK/68

1.  50 µg H3C13         H3
    50 µg D             H1

73
73

2.  10 µg H3C13         H3
    10 µg D             H1

67*
100

3.  1 µg H3C13          H3
    1 µg D              H1

4.  50 µg H3C13         H3

86
73
73

5.  50 µg D             H1

6.  Al+3 control

47

*  p ≤ 0.001 vs. control group
** p ≤ 0.03 vs. control group

These data demonstrate that mice immunized with a mixture of the D
protein and H3C13 protein in aluminum adjuvant were protected against challenge
with either A/PR/8/34 (H1) or A/HK/68 (H3) virus. In contrast, mice immunized
with the D protein were protected against H1 but not H3 challenge. Likewise, mice
immunized with the H3C13 protein were protected against the H3 but not the H1
challenge. Therefore, the combination of the D protein and the H3C13 protein
elicited protection against the currently circulating subtypes of influenza A virus.
Thus, this combination represents a subtype cross-protective vaccine.

EXAMPLE 14 - CREATION OF pEA181KnRBS3 VECTOR

pMG1 (Example 2) and pMG42Kn (Example 5) were both digested
with BamHI and NcoI. A 236 bp BamHI/NcoI fragment containing the coding
sequence for the amino acid sequence spanning residues 1 to 81 of the NS1 gene
was isolated from pMG1. The digested pMG42Kn and the 236 bp fragment were
ligated together and transformants were selected on LB and kanamycin agar
plates. The resulting vector, pMG181Kn(cII), maintains all regulatory elements
of pMG42Kn except that the NS1 (aa 1-42) sequence is replaced with the NS1
(aa 1-81) sequence.

pMG181Kn(cII), described above, was digested with BstXI and
BamHI. The following linker encoding a ribosome binding site (RBS3) was cloned
into the digested vector, replacing the cII RBS. The linker sequence is:

5'     TAAGGAGGATATAACATATG      3'  [SEQ ID NO: 47]
3' TGGAATTCCTCCTATATTGTATACCTAG  5'  [SEQ ID NO: 48]
   BstXI                  BamHI

The resulting vector is pMG181KnRBS3.
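
As a quick consistency check on the RBS3 linker, the sketch below confirms
that the two strands listed as SEQ ID NO: 47 and SEQ ID NO: 48 are
complementary and reports the single-stranded ends left on the lower strand.
It is an illustration only; the helper name `complement` and the variable names
are hypothetical, and the sequences are taken as printed above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, kept in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq)


top = "TAAGGAGGATATAACATATG"              # SEQ ID NO: 47, written 5'->3'
bottom = "TGGAATTCCTCCTATATTGTATACCTAG"   # SEQ ID NO: 48, written 3'->5'

# Find where the top strand pairs along the bottom strand.
offset = bottom.find(complement(top))
left_end = bottom[:offset]                 # unpaired bases at the BstXI end
right_end = bottom[offset + len(top):]     # unpaired bases at the BamHI end

print("paired region starts at offset", offset)
print("BstXI-side single strand (3'->5'):", left_end)
print("BamHI-side single strand (3'->5'):", right_end)
print("BamHI-side single strand (5'->3'):", right_end[::-1])
```

Read 5' to 3', the BamHI-side end is GATC, the four-base overhang that a
BamHI-cut vector accepts.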

To generate pEA181KnRBS3, a 1.2 kb EcoRI/BglII fragment from
similarly digested pOTSV containing the lambda rexA rexB region was cloned into
mp18 [Gibco/Bethesda Research Labs] and mutagenized to create silent mutations
in the two NdeI sites in this region. The mutations were CATATG to CATGTG in
both sites. One site is in rexA and the other is in rexB. The mutagenized
fragment was inserted into both EcoRI/BglII digested pMG181Kn(cII) and similarly
digested pMG181KnRBS3, resulting in the plasmids pEA181Kn(cII) and
pEA181KnRBS3, respectively. pEA181KnRBS3 has the useful properties of the
pMG vectors, plus the additional attribute of nalidixic acid induction.

An EcoRI/PstI fragment containing the ampR gene of pBR322 was
then inserted into EcoRI/PstI digested pEA181Kn(cII) and pEA181KnRBS3 to
create pEA181cIIamp and pEA181RBS3amp, respectively. These plasmids are
rexB+ and should be nalidixic acid inducible, in contrast to pMG1 and its
descendants, which are rexB- and cannot be induced with nalidixic acid. The
mutant EcoRI/BglII region was functionally examined by cloning it into a pMG1
vector carrying galK and demonstrating induction of galK with nalidixic acid.

EXAMPLE 15 - CREATION OF VECTOR FOR PRODUCTION OF
NS1(1-81)BHA2(1-223)

Plasmid pOTS208BLeeHA2 was created as follows. An EcoRI
fragment encoding the B/Lee HA region from plasmid pBHA (Example 5) was
cloned into pSelect to generate pSelectPBHAS2. Site-directed mutagenesis inserted
an NcoI site at the start of HA2, resulting in an N-terminus of MET GLY PHE PHE
and a C-terminus of SER ILE CYS LEU. The resulting construct is called
pSelectPBHAS2-B1. This plasmid was cut with NcoI and XbaI (a site in the
polylinker of pSelect downstream of the HA gene), and ligated into NcoI/XbaI
digested pEA181KnRBS3, described above, to generate pEA181BLeeB1-1. A
BamHI/EcoRI, filled in, fragment was cut out of pEA181BLeeB1-1 and ligated into
pOTS208 (Example 9A) that had been digested with XbaI, filled in, and BamHI.
The EcoRI and XbaI sites were regenerated by the ligation. This extra cloning step
was necessary because there was no convenient cloning site to fuse the gene to
NS1(1-81) in pOTS208. pEA181KnRBS3 (described above) and pOTS208 have
unique BamHI and XbaI sites to facilitate fragment exchanges.

A BamHI/XbaI fragment of about 1011 bp encoding the
NS1(1-81)BLHA2(1-223) sequence from plasmid pOTS208BLeeHA2 was isolated and
ligated into vector pSelect-1 [Promega], which was also digested with BamHI and
XbaI. The resulting construct is called pSelBC13. This vector contains the coding
sequence for NS1(1-81)BHA2(1-223), also termed BC13 [SEQ ID NO: 57].

EXAMPLE 16 - CREATION OF VECTOR FOR PRODUCTION OF BC13mut2

Mutagenesis was carried out on pSelBC13 using Promega's
protocol and oligonucleotide 5492, SEQ ID NO: 50:

GGAC3GATGGGAAGGACTCATTGCAGGTTGG

This mutagenesis changed the ATG codon within the HA2 portion of the molecule
to CTC (MET to LEU). The resulting plasmid is called pSelBC13mut5492. This
plasmid was then digested with NcoI and XbaI, releasing a fragment encoding
HA2, which was ligated into pOTS208NS181Nco (Example 9C) that had been
digested with NcoI and XbaI. The resulting construct, pOTS208NS1BLHA2mut5492,
codes for the same polypeptide as pOTS208BLeeHA2 (i.e., BC13), except that the
internal translation start at amino acid position 98 of the fusion protein is
eliminated. This protein is NS1(1-81)BHA2(1-223)(met-leu) [SEQ ID NO: 55].

A HindIII fragment of approximately 1 kb, encoding NS1 (amino acid
residues 7-81) and B/Lee HA2 (amino acid residues 1-223) and containing the
MET to LEU change, was isolated from plasmid pOTS208NS1BLHA2mut5492.
This fragment was ligated into the HindIII site of vector pSelect-1, resulting in
pSelBC13mut5492. Mutagenesis was carried out using Promega's protocol and the
following oligos 5920, 5921 and 5939, respectively:

SEQ ID NO: 51
CrCrGCrGTAGAAATCGGTAACGGTTGCTTTGAAACCAAAC

SEQ ID NO: 52
GGTTTCTTGGAAGGTGGTTGGGAAGGTCTCATTGCAGGTTGGCACGG

SEQ ID NO: 53
GCTTTCCAACGAAGGTATCAATCAACAGTGAAGACGAGCATCTCn^

This mutagenesis created the following silent codon changes in the
HA2 region:

The codons for GLY at positions 93, 94, 97, 187, 215, and 217 were
each mutated from GGG to GGT; the codons for ILE at positions 188, 189, and 214
were each changed from ATA to ATC; the codon for ASP at position 193 was
changed from GAT to GAC; and the codon for ASN at position 216 was changed
from AAT to AAC.
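
The codon substitutions described in this Example can be checked against the
standard genetic code. The sketch below is an illustration, not part of the
original protocol: it builds the standard codon table and confirms that the
four substitutions listed above leave the amino acid unchanged, while the
earlier ATG to CTC change replaces MET with LEU. Variable names are
hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Substitutions reported as silent in the HA2 region.
silent_changes = [("GGG", "GGT"), ("ATA", "ATC"), ("GAT", "GAC"), ("AAT", "AAC")]
# Substitution made with oligonucleotide 5492 (MET to LEU).
coding_change = ("ATG", "CTC")

for old, new in silent_changes:
    print(old, "->", new, ":", CODON_TABLE[old], "->", CODON_TABLE[new])

old, new = coding_change
print(old, "->", new, ":", CODON_TABLE[old], "->", CODON_TABLE[new])

all_silent = all(CODON_TABLE[o] == CODON_TABLE[n] for o, n in silent_changes)
```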

The resulting plasmid was called pSelBC13mut2. This plasmid was
then digested with NcoI and XbaI, releasing a fragment of about 775 bp encoding
HA2. This fragment was ligated into pOTS208NS181Nco (described above)
that had been digested with NcoI and XbaI. The resulting construct,
pOTS208NS1BLmut2 (see Fig. 5 [SEQ ID NO: 54]), codes for the same
polypeptide [SEQ ID NO: 55] as pOTS208NS1BLHA2mut5492, except for the
silent codon changes.

EXAMPLE 17 - EXPRESSION OF FUSION PROTEIN

pOTS208NS1BLmut2 [SEQ ID NO: 54] is transfected into a suitable
host cell, preferably an E. coli strain, and expressed essentially as described for the
H3 proteins described above. Strain LW14 is a derivative of E. coli K-12 strain
W3110 [ATCC E. coli 27325]. The transducing phage P1 [E. coli ATCC 25404-B1]
was grown on E. coli K-12 strain AR58, described above, the genotype of
which is thr- galE::Tn10 λ cI857 bio- uvrB- rpsL. Phenotypically, strain AR58
requires threonine and biotin for growth, is sensitive to UV light and DNA damaging
agents, cannot use galactose as a carbon source, and is resistant to streptomycin.
Strain W3110, a prototroph, is incubated with the phage and plated onto a medium
containing tetracycline to select for the transduction of the Tn10 element. The P1
phage picks up the segment of DNA containing the Tn10 and brings with it the λ
cI857 bio- uvrB-. Thus the strain LW14 has the following genotype: galE::Tn10 λ
cI857 bio- uvrB-. Phenotypically, strain LW14 requires biotin for growth, is
sensitive to UV light and DNA damaging agents, and cannot use galactose as a
carbon source.

EXAMPLE 18 - PURIFICATION OF BC13mut2

E. coli whole cells transformed with the pOTS208NS1BLmut2
plasmid [SEQ ID NO: 54] as described in Example 16 above were recovered after
fermentation by centrifugation or tangential flow filtration, washed to remove
media, and stored at -70°C until use.

A. Step 1: Lysis and centrifugation (separation)

E. coli cells, 500 g wet cell weight (WCW), were thawed and
suspended in 4-7 volumes (2 L) of buffer containing 0.025 M Tris-HCl, 0.005 M
EDTA, pH 8.0. Chicken egg lysozyme (Calbiochem; suspension at 100 mg/mL)
was added to a final concentration of 1 g/L and the preparation was stirred with a
Tekmar mixer at room temperature for 1 hour.

The lysate was centrifuged at 15,000 x g for 1 hour at 4°C and the
supernatant discarded. The pellet (P1) was resuspended in 5 mL per gram of
original wet cell weight of buffer consisting of 0.025 M Tris-HCl, 0.002 M MgCl2,
pH 8.0 (about 2.5 L).

The yield of this step was 90-100% by SDS-PAGE analysis, and
65-100% by RP-HPLC for product.
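
The volumes quoted in Step 1 follow from simple proportions. The sketch
below reworks them for the 500 g batch described above. It is an illustration
only: the variable names are hypothetical, and the lysozyme calculation
ignores the small volume added by the stock itself.

```python
wet_cell_weight_g = 500.0          # starting cell paste
lysis_buffer_l = 2.0               # Tris/EDTA lysis buffer
lysozyme_final_g_per_l = 1.0       # target lysozyme concentration
lysozyme_stock_g_per_l = 100.0     # 100 mg/mL stock suspension
resuspension_ml_per_g = 5.0        # Tris/MgCl2 buffer for pellet P1

# Buffer volumes per gram of cells (mL/g).
buffer_volumes = lysis_buffer_l * 1000.0 / wet_cell_weight_g

# Stock volume needed to reach 1 g/L in the lysis suspension.
lysozyme_stock_ml = (
    lysozyme_final_g_per_l * lysis_buffer_l / lysozyme_stock_g_per_l * 1000.0
)

# Volume used to resuspend pellet P1.
resuspension_l = resuspension_ml_per_g * wet_cell_weight_g / 1000.0

print(f"lysis buffer: {buffer_volumes:.1f} mL per g cells")
print(f"lysozyme stock: {lysozyme_stock_ml:.0f} mL")
print(f"P1 resuspension volume: {resuspension_l:.1f} L")
```

The 2.5 L resuspension volume matches the figure given in the text, and 2 L of
lysis buffer corresponds to the low end of the stated 4-7 volume range.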

B. Step 2: Nuclease digestion and extraction

The preparation was treated with benzonase to digest nucleic acids,
then extracted with nonionic detergents to reduce the levels of E. coli contaminants
in the pellet. Benzon nuclease, 0.2 mL per L of suspension, was added to the
suspension, which was then stirred at room temperature for 1 hr. The sample was
diluted with one volume of cold water containing 2% w/v Triton X-100 and 0.2%
deoxycholate and stirred for 30 min at or below 15°C. Centrifugation was repeated
as in step 1 and the supernatant discarded.

C. Step 3: Urea extraction

The pellet (P2) was extracted with 5 mL/g WCW of cold 0.025 M
NaH2PO4, 0.025 M Tris-HCl, pH 6.0, containing 4 M urea and 10 mM
dithiothreitol (DTT). The Tekmar was used at a very low speed to mix, and the
temperature was held below 15°C. The sample was stirred at 4°C for 1 hr, then
centrifuged as in step 1. The supernatant (S3) was discarded. The pellet (P3) must
be stored in the freezer until use.

D. Step 4: Solubilization, reduction, and DEAE chromatography

The P3 pellet was solubilized and applied to anion exchange
chromatography. This step removes remaining nucleic acid and major host cell
proteins. P3 was suspended to 5 mL per g WCW in 0.01 M Tris base, 8 M urea (pH
not adjusted). DTT was added to 25 mM. The pH was then adjusted to 12.5 using
6 N NaOH, with stirring for 15 min at room temperature, immediately followed by
a 5-fold dilution of the same with 10 mM boric acid containing 25 mM DTT. If
needed, the sample may be diluted to keep conductivity below 2 mS/cm. The pH was
adjusted to 9.0 and the sample stirred for up to 2 hours at room temperature.

The pH 12.5 treatment was necessary to complete solubilization of
the B/Lee protein. However, since carbamylation may occur under these conditions,
the time was controlled very carefully. In addition, the pH 9 adjusted sample was
unstable and could not be held.

The sample (no more than 12 mg total protein per mL of resin) was
then loaded onto a 14 x 250 cm (0.75L) DEAE Toyopearl 650M column
equilibrated with buffer A. All steps were performed at room temperature at a
linear velocity of 100 cm/hr. The column was washed sequentially with 2-3 column
volumes of buffers B, C, and D, then eluted with buffer E. When protein began to
elute from the column, flow was stopped for 15-20 minutes to improve the
efficiency of elution of the B/Lee product; then the peak of product protein was
collected. The column was cleaned with buffer D followed by 0.5 N NaOH.

The yield of this step was 85-90% by SDS-PAGE or Western blot
analysis, and was estimated at 65-70% by RP-HPLC assay for product.

E. Step 5: Pretreatment and reverse phase chromatography

The buffer E eluate from step 4 was adjusted to no more than 1 g/L
protein concentration and made 2% in SDS, 30 mM DTT, 0.1 M Tris, 5 mM
EDTA, pH 9, then heated at either ^O^'C for 60 min, 95°C for 30 min, or 100°C
for 25 minutes, using a heat exchanger or water bath. This treatment was necessary
to break up aggregates and prepare the sample for RP chromatography. The sample
was cooled to room temperature and 2-propanol was added to 10% v/v.

The sample was injected on an Amberchrome reverse phase column
equilibrated in 10% 2-propanol/0.2% trifluoroacetic acid (TFA)/water. The
gradient shown in Table 4 was used to elute the column. Fractions containing
product were analyzed by analytical RP-HPLC, pooled, and held at 4°C. The
column was 25 cm in height and was run at a linear velocity of 75-80 cm/hr at
ambient temperature. An Amicon Vantage column, 9 cm in diameter, was used.
The loading capacity of the column was 2 g/L.

The reverse phase column step has a yield of 30-80% (60-80% is
typical).

F. Step 6: Precipitation

The pH of the RP eluate was adjusted to 6.0 +/- 0.5 using 1 N NaOH.
After 10-15 min of stirring at room temperature, the precipitate was collected by
centrifugation at 16,000 x g for 30 min at 4°C. The precipitate was resuspended to
approximately 6-8 mg/mL protein concentration in 25 mM Tris, 8 M urea. DTT
was added to 25 mM, and the sample stirred for 30 min at room temperature. The
pH was adjusted to 12.5 and stirring repeated for 15 min, immediately followed by
pH adjustment to 9.0 using HCl.

Alternately, the precipitate was suspended in buffer containing 0.1 M
Tris-HCl, 2% SDS, 0.01 M EDTA, pH 8.0-9.0. DTT was added to 25 mM, and
the sample was stirred 15-30 min until the solution was clear and all of the
precipitate had dissolved. The sample was immediately taken to the next step.

G. Step 7: Desalting and preparation of final product

A 7 x 10 cm column was packed with Sephadex G25M (Pharmacia)
at room temperature. It was equilibrated with 3-7 column volumes of 25 mM
Tris-HCl, pH 9.0, containing 5% w/v mannitol. Sample, at 6-10 mg/mL protein
concentration, was injected on the column (20-25% of total column volume, i.e.
80-100 mL per injection). The column was developed at 150 cm/hr linear velocity
and the product desalted into the column buffer. The final product can be stored
at 4°C. The yield of steps 6 and 7 together was no less than 90%.

The product of the purification process was recovered at an overall
yield of about 20-40%, and was over 95% pure by SDS-PAGE and RP-HPLC
analysis. The final yield is about 3 g/500 g wet cell weight.
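
The injection volume and flow rate for the Step 7 desalting column can be
derived from the column geometry. The sketch below is a worked illustration
under the assumption that "7 x 10 cm" means a 7 cm diameter and a 10 cm bed
height; the variable names are hypothetical.

```python
from math import pi

diameter_cm = 7.0            # assumed: first dimension is the diameter
bed_height_cm = 10.0         # assumed: second dimension is the bed height
linear_velocity_cm_per_hr = 150.0

cross_section_cm2 = pi * (diameter_cm / 2.0) ** 2
column_volume_ml = cross_section_cm2 * bed_height_cm   # 1 cm^3 = 1 mL

# Sample load of 20-25% of the total column volume.
load_low_ml = 0.20 * column_volume_ml
load_high_ml = 0.25 * column_volume_ml

# Volumetric flow rate = linear velocity x cross-sectional area.
flow_ml_per_min = linear_velocity_cm_per_hr * cross_section_cm2 / 60.0

print(f"column volume: {column_volume_ml:.0f} mL")
print(f"sample load: {load_low_ml:.0f}-{load_high_ml:.0f} mL per injection")
print(f"flow rate: {flow_ml_per_min:.0f} mL/min")
```

Under this reading of the dimensions, the computed load of roughly 77-96 mL
agrees with the 80-100 mL per injection stated in the text.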

Table 4

Gradient for RP-LC of B/Lee

                     %A      %B

  0        80        90      10
  5        80        90      10
 20        80        55      45
120        80        35      65
145        80        10      90
180        80        10      90
181         0        10      90

A: 0.2% TFA in water
B: 99.8% 2-propanol/0.2% TFA

H. Purification of FluD, NS1(1-81)HA2(65-222)

FluD (Example 10) may be purified in much the same manner as the
B/Lee with the following parameter alterations. For DEAE chromatography, the
FluD column was equilibrated in 8 M urea, 50 mM Tris, 25 mM borate at pH 9.0.
After the sample is loaded, sequential washes are performed with the following
buffers: 4 M urea in Tris-borate pH 9.0, 4 M urea and 0.4 M NaCl in Tris-borate pH
9.0, and Tris-borate pH 9.0. The product is eluted with a step elution of 2% SDS,
0.1 to 0.25 M NaCl, in Tris-borate pH 9.0. Prior to RPLC, the protein concentration
is adjusted to 1 mg/mL or less, the product is heated at 95°C for 30 minutes and
cooled, and 2-propanol is added to 10% v/v. The column is then loaded. RPLC is
then performed on Amberchrome resin, as described above for B/Lee. Up to 2-3
mg of protein are loaded per mL of resin. The final yield is about 4 g/500 g wet cell
weight.

EXAMPLE 19 - 3-PART INFLUENZA VACCINE

A recombinant vaccine was formulated to contain 1 µg each of the
recombinant proteins NS1(1-81)HA2(65-222) (Example 11),
NS1(1-81)H3HA2(1-221)mut5585 (Example 10), and the BC13mut2 (described in
Example 15 above) in Al+3 (100 µg) plus 3-O-deacylated monophosphoryl lipid A
(3D-MPL) (5 µg) [described in U.S. Patent No. 4,912,093; commercially available
from Ribi Immunochem Research, Inc., Hamilton, Montana]. Prior to inclusion in
the recombinant vaccine, the influenza proteins were purified as described in
Example 15 above to remove any contaminating bacterial proteins, DNA, and
endotoxin.

Mice (female, CB6F1) were divided randomly into groups with 15
mice per group. The mice were injected subcutaneously on days 0 and 21 with the
recombinant vaccine. A group of control mice was injected with the same dose of
Al/MPL without antigen according to the same schedule. Mice were challenged
with 3-5 LD50 doses of virus on day 49 and survival was monitored through day 21
post-challenge. In the following table showing these results, N.D. = not done and,
under the antigens, H1 = NS1(1-81)HA2(65-222), H3 =
NS1(1-81)H3HA2(1-221)mut5585 and B = NS1(1-81)BLHA2(1-223)mut2.

Table 5

Type A and B Cross-Protection in Mice Immunized with a Combination of
Recombinant HA2 Antigens

Percent Survival After Challenge with:
A/PR/8/34 (H1)    A/HK/68 (H3)    B/Lee/40 (B)



Antigen
#1  H1/H3/B
    H1
    H3
    B
    control

73*    60*    N.D.    0    7

80*    N.D.    73*    7    0

73    N.D.    0    33    1


#2  H1/H3/B      93*      80*      100*
    H1           86*      N.D.     N.D.
    H3           N.D.     53**     N.D.
    B            N.D.     N.D.     80*
    control      0        7        7

*  p ≤ 0.001 vs. control group
** p ≤ 0.01 vs. control group
^  p > 0.05 (not statistically different than control group)



The data in Table 5 above result from two experiments that
demonstrate that mice vaccinated with the combination of H1, H3, and Type B HA2
antigens were protected against all three virus challenges (H1, H3 and Type B)
(≥73-100% survival vs. 0-7% in controls). The H1 and H3 antigens in Al/MPL
were subtype protective when administered individually, as shown in Table 5. The
Type B construct administered without the other antigens was protective in only
one of the two studies (Exp. 1: 33% survival vs. 0% survival in controls; Exp. 2:
80% survival). Thus, the preliminary data are equivocal on
the stability of the Type B construct when formulated in Al/MPL in the absence of
the other HA2 antigens. Studies are ongoing to confirm the stability of the construct
in other formulations and in NIH/Swiss mice to confirm activity in an outbred
system.

Although each antigen contains the NS1(1-81) region from
A/PR/8/34 (H1) virus, protection against H1 challenge was only achieved with the
D protein, which contains the H1HA2 region as well. Thus, the H3HA2 and Type B
HA2 portions of each chimeric antigen are responsible for conferring
subtype-specific protection. Thus, the combined HA2 constructs provide
cross-protection for all currently circulating influenza Type A (H1 and H3
subtypes) and Type B viruses.

Survival of NIH/Swiss outbred mice immunized with the mutant
NS1(1-81)BHA2(1-223)(met-leu) (not shown) showed activity at 100 micrograms
(73% survival), but reduced activity at lower doses. This confirms earlier studies in
outbred mice showing reduced potency relative to H1 or H3 constructs (which are
active at ≤ 1 microgram per dose). In contrast, in CB6F1 inbred mice, an inverse
dose response or no dose response is seen with NS1(1-81)BHA2(1-223)(met-leu).

EXAMPLE 20 - PLASMID pMS3H3HA

Plasmid pFV88 contains the entire 221 amino acid length HA2 from
A/Udorn, an H3 subtype virus [C. J. Lai et al., Proc. Natl. Acad. Sci. USA,
77:210-214 (1980)], which HA2 nucleic acid sequence is illustrated in Fig. 7
[SEQ ID NO: 1]. This plasmid was cut with PstI. The resulting 1900 bp fragment,
which contains the entire HA (HA1 and HA2) fragment and some GC tailing, was
then inserted into pUC18 [Bethesda Research Laboratories]. The resulting plasmid
is termed pMS3 or pMS3H3HA.

EXAMPLE 21 - pMG1

Plasmid pAPR801 is a pBR322-derived cloning vector which carries
the NS1 coding region (A/PR/8/34). It is described by Young et al., in The Origin
of Pandemic Influenza Viruses, ed. by W. G. Laver, Elsevier Science Publishing
Co. (1983).

Plasmid pAS1 is a pBR322-derived expression vector which contains
the PL promoter, an N utilization site (to relieve transcriptional polarity effects in
the presence of N protein) and the cII ribosome binding site including the cII
translation initiation codon followed immediately by a BamHI site. It is described
by Rosenberg et al., in Methods Enzymol., 101:123-138 (1983).

Plasmid pAS1ΔEH was prepared by deleting a non-essential
EcoRI-HindIII region of pBR322 origin from pAS1. A 1236 base pair BamHI
fragment of pAPR801, containing the NS1 coding region in 861 base pairs of viral
origin and 375 base pairs of pBR322 origin, was inserted into the BamHI site of
pAS1ΔEH. The resulting plasmid, pAS1ΔEH/801, expresses authentic NS1 (230
amino acids). The plasmid has an NcoI site between the codons for amino acids 81
and 82 and an NruI site 3' to the NS sequences. The BamHI site between amino
acids 1 and 2 is retained.

Plasmid pMG27N, a pAS1 derivative [Mol. Cell. Biol., 5:1015-1024
(1985)], was cut with BamHI and SacI and ligated to a BamHI/NcoI fragment
encoding the first 81 amino acids of NS1 from pAS1ΔEH801 and a synthetic DNA
NcoI/SacI fragment of the following sequence:

SEQ ID NO:10:
5'-CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT-3'

SEQ ID NO:58:
3'-CTAGTATACAATTGTCTATAGTTCCGGACTGACTGACTC-5'

The resulting plasmid, pMG1, allows the insertion of DNA fragments
after the first 81 amino acids of NS1 in any of the three reading frames within the
synthetic linker fragment, followed by termination codons in all three reading
frames.
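
The statement that the synthetic linker supplies termination codons in all
three reading frames can be checked directly on the upper strand given above.
The sketch below is an illustration only; the function name is hypothetical,
and frames are numbered 0, 1 and 2 relative to the first base of the linker as
printed, which is an arbitrary convention chosen for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Upper strand of the synthetic NcoI/SacI linker, written 5'->3'.
linker_top = "CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT"


def stops_by_frame(seq):
    """Return {frame: [(position, codon), ...]} for every stop codon in seq."""
    frames = {0: [], 1: [], 2: []}
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            frames[i % 3].append((i, codon))
    return frames


frames = stops_by_frame(linker_top)
for frame, hits in frames.items():
    print("frame", frame, hits)

every_frame_terminates = all(frames[f] for f in (0, 1, 2))
```

The repeated TGACTGACTGA motif near the 3' end places one TGA codon in each of
the three frames, which is what closes every reading frame after an insert.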

EXAMPLE 22 - pMG1H3HA2

Plasmid pMG1, described above in Example 21, was digested with
NcoI and XbaI, releasing a 54 bp fragment, which was discarded. pMS3H3HA,
described in Example 1 above, was digested with HhaI and XbaI, and a 701 bp
fragment containing the coding sequence for the HA2 subunit of influenza strain
A/Udorn (H3N2) was isolated, as illustrated in Fig. 1 [SEQ ID NO: 1].

Synthetic oligonucleotides were annealed to generate an NcoI 5'
overhang sequence (at the 5' end) and a HhaI 3' overhang sequence (at the 3' end).
The sequence of these oligonucleotides is as follows:

SEQ ID NO: 66: 5'-CATGGGCGCCCATATGGGCATATTCGGCG-3'
SEQ ID NO: 67: 3'-CCGCGGGTATACCCGTATAAGCC-5'

The annealing reaction was performed as follows. The annealing mixture was made
up of 2.5 µL each of the 5' oligo (1.3 µg/µL) and the 3' oligo (1.2 µg/µL), with
water (15 µL) added to a final volume of 20 µL. The reaction tubes were then
placed in 4 mL culture tubes containing water which had been heated to 65°C for
10 minutes and allowed to cool down slowly. The tubes were then put on ice and
used immediately for ligation.

This three part ligation generates pMG1H3HA2(1-221) [SEQ ID
NO: 9], which codes for the first 81 amino acids of NS1 fused to four amino acids
donated from the linker and amino acids 1-221 of the HA2 subunit. This sequence
is illustrated in Fig. 2 [SEQ ID NO: 9 & 10]. This molecule is also designated
NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10].

EXAMPLE 23 - PREPARING SEED VIRUS AND RAISING ANTISERA

The seed virus, A/Udorn, was prepared according to the procedures
described in P. Palese and J. Schulman, Virol., 52:227-237 (1974). Briefly, this
technique is as follows.

Influenza virus strain A/Udorn was inoculated into the allantoic cavity
of 10-day old embryonated hen's eggs. The eggs were incubated for
24-48 hours at 35°C, then chilled at 4°C overnight. A portion of the eggshell over
the airsac was removed and the allantoic fluid was aseptically removed using a
10-mL syringe. The fluid was centrifuged at low speed (3,000 x g) to remove
particulates. This clarified supernatant was centrifuged at high speed using an
SW28 Beckman rotor at 27,000 rpm (4°C for 90 minutes), resulting in the virus
pellet. The virus was resuspended in 10 mM Tris (pH 7.5) containing 100 mM
NaCl, 1 mM EDTA and repelleted as before. The virus was layered on a 30-60%
sucrose gradient in 1 mM EDTA (NTE) and spun for 3-5 hours at 25,000 rpm. The
band in the middle of the tube was withdrawn, diluted in NTE and centrifuged at
27,000 rpm for 90 minutes. The pellet was suspended in phosphate-buffered saline
(PBS). These viral particles were used as immunogens for preparation of antisera.

Antisera were prepared as follows. 100-200 micrograms of purified
virus in complete Freund's adjuvant was injected into the subscapula of a New
Zealand White rabbit. A second injection in incomplete Freund's adjuvant was done
4 weeks later, and the animals were bled 7-10 days later.



EXAMPLE 24 - MODIFICATION AND EXPRESSION OF H3HA2 FUSION
PROTEINS

The modified nucleotide sequences encoding the H3HA2 proteins
were prepared by mutating the nucleotide sequences of the fusion proteins prepared
according to Example 22 above. Site directed mutagenesis using the Altered Sites
System [Promega Corporation] according to the manufacturer's directions was used
to change nucleotide numbers 622, 625 and 634 (A to C) and 624, 627, and 636 (G
to T) of the nucleotide sequence [SEQ ID NO:9] encoding the
NS1(1-81)H3HA2(1-221) fusion protein of Fig. 3 [SEQ ID NO:10], thereby changing
the codons at these regions from AGG to CGT, both encoding Arg. These changes
correspond to nucleotide numbers 367, 370 and 379 (A to C) and 369, 372 and 381
(G to T) of the HA2 fragment of Fig. 7 [SEQ ID NO: 1].
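
The two sets of nucleotide numbers above differ by a constant offset, which
can be reproduced from the construct layout given in Example 22 (81 NS1
codons plus four linker codons ahead of HA2). The sketch below is a worked
illustration; it assumes that numbering starts at 1 with the first base of the
first codon in each sequence, and the names are hypothetical.

```python
NS1_CODONS = 81        # NS1 residues 1-81
LINKER_CODONS = 4      # residues donated by the linker
OFFSET = 3 * (NS1_CODONS + LINKER_CODONS)   # nucleotides preceding HA2

# Mutated positions in the fusion coding sequence (first and third bases
# of each altered AGG codon).
fusion_positions = [622, 624, 625, 627, 634, 636]
ha2_positions = [n - OFFSET for n in fusion_positions]


def codon_number(nt):
    """1-based index of the codon containing 1-based nucleotide nt."""
    return (nt - 1) // 3 + 1


ha2_codons = sorted({codon_number(n) for n in ha2_positions})

print("offset:", OFFSET)
print("HA2 nucleotide positions:", ha2_positions)
print("HA2 codons affected:", ha2_codons)
```

Under this numbering assumption the changes fall in HA2 codons 123, 124 and
127, so two of the three altered arginine codons are adjacent.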

Fig. 2 illustrates the modified nucleotide sequences of the fusion
proteins [SEQ ID NO: 58] by contrast with the nucleotide sequence [SEQ ID NO:9]
of the "unmodified" fusion proteins (nucleotide changes below and amino acid






wo 94/17826 



PCT/US94/01149 



changes above the sequences of the unmodified fusion protein). Mutagenesis on this
sequence was carried out according to the method provided with the pSelect kit
from Promega.

A. NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10]

Briefly, cloning for the mutagenesis was performed as follows. The
pSelect plasmid [Promega] and pMG1H3HA2 (Example 22) were each digested
with HindIII. These two plasmids were ligated together and selected on tetracycline
plates. The resulting vector is pSelH3HA2. Mutagenesis was performed according
to Promega's kit. The following oligonucleotide was used: SEQ ID NO:68:

5'-AAACTGTTTG AAAAAACACG TCGTCAACTG CGTGAAAATG
CTGACGACAT GGGC-3'.

Clones were verified by restriction endonuclease Hiofill. The 
resulting plasmid, pSelH3HA2mut5585, was digested with NcoI and XbaI, and a 748
bp fragment coding for the H3HA2mut5585 polypeptide was isolated.
pOTS208NS181(Eco-740) was digested with NcoI and XbaI. The ligation of linear
pOTS208NS181Nco and the 748 bp fragment resulted in pOTS208NS1H3mut5585
[SEQ ID NO:58]. This vector codes for the polypeptide NS1(1-81)H3HA2(1-221)
[SEQ ID NO: 10].

B. Expression of mutated NS1(1-81)H3HA2 proteins

The plasmid of A was transfected into E. coli strain AR58 [SmithKline Beecham].
Cultures are grown at 32°C to mid-log phase, at which time cultures are shifted to
39.5°C for two hours. The E. coli cell pellets containing the recombinant
polypeptide are then stored at -70°C until used. Production of the
NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] is confirmed by Western blot
analysis [Towbin et al., Proc. Natl. Acad. Sci. U.S.A., 76:4350 (1979)] using
antisera prepared against A/Udorn virus, as described in Example 23. A major
immunoreactive species is expected at a molecular weight of approximately 35,000
daltons.

The expression levels obtained are about 50-100% higher than those
obtained by the expression of the unmodified coding sequences in the same
expression system.

EXAMPLE 25 - tRNA INSERTION INTO HOST CELLS EXPRESSING H3
PROTEIN

E. coli host cells containing H3N2 fusion protein obtained as
described in Example 22 above were transformed using conventional techniques.
See, e.g., Sambrook et al., cited above.

Briefly, a culture of E. coli strain MM294cI+ containing the plasmid
pDC952 was grown overnight in Luria broth with chloramphenicol. The plasmid
pDC952 carries the argU gene, which encodes the tRNA that recognizes the
AGA/AGG codons [P. Saxena and J. Walker, J. Bacteriol., 174(6):1956-1964 (Mar.
1992)]. From this culture the plasmid pDC952 was prepared. A second culture of
E. coli, strain AR13 [SmithKline Beecham], carrying the plasmid for the H3 flu
antigen, was grown overnight in Luria broth with kanamycin. These cells were
made competent for transformation as described below.

The H3/AR13 overnight culture was diluted 1:50 in LB and
kanamycin (50 mL total) and incubated at 37°C until it reached an O.D.650 of 0.6.
The culture was then transferred to a 50 mL conical tube and chilled at about 4°C.
Following this, the tube was centrifuged in a TJ6 centrifuge (10 min; 2000-3000
rev/min), the pellet resuspended in 25 mL 100 mM CaCl2, and placed on ice for
about 30 minutes. The cells were then centrifuged as described above and
resuspended in about 2.5 mL 100 mM CaCl2.

The competent cells were aliquoted (100 ^1) into three separate 
sterile tubes. The first tube was the negative control and did not receive any DNA. 
The second tube was a positive control and 1 |Xl of plasmid pTyU was added to the 
cells. To the third tube was added 3 ^il of pDC952. These controls served to ensure 

25 that transformation occurred. Each tube of cells was mixed, placed on ice for 60 
min., heat shocked at 37X in a water bath for 2 minutes, and incubated in a 32®C 
water bath for 60 min. after adding 1 mL LB. The tubes were then microfuged for 1 
minute and die supematants poured off until only about 200 ^L were left. The 
pellets were then resuspended in the remaining supernatant and plated as follows: 

30 (1) on LB and chloramphenicol, (2) on LB and ampicillin» and (3) on LB and 

chloramphenicol and kanamycin. The plates were then incubated at 32*^C overnight 
Shake flasks were inoculated with the control strain, H3/AR13, and 4 
transformants, pDCS52/H3/AR13, and grown at 32°C to an optical density of 0.6 to 
0.7 at which point the cultures were shifted to 39.5°C for 3 hours. Samples were 

35 taken at induction stan (temperature shift to 39.5°C) and 3 hours post-induction. 
These samples were analyzed by high performance liquid chromatography (HPLC) 
and Western blotting, 
The results of these analyses indicated that expression of H3 had
increased by as much as 80% and that the presence of the argU gene had eliminated
the lowest Western-positive band seen with the wild-type constructs (H3/AR13).
It is believed that these results were obtained by eliminating the frameshifting
caused by tandem AGG rare arginine codons. Further, there did not appear to be
any difference in product quality between the H3 mutant prepared according to
Example 24 and the argU tRNA transformants made according to this Example.
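
To illustrate what "tandem rare arginine codons" means in sequence terms, the following Python sketch scans an in-frame coding sequence for adjacent AGG/AGA codons, the two arginine codons read by the argU tRNA in E. coli. It is only an illustration under stated assumptions: the function names are hypothetical, and the test fragment is the row of SEQ ID NO:1 covering codons 113 to 128 as printed in the Sequence Listing.

```python
# Arginine codons decoded by the argU tRNA in E. coli
RARE_ARG_CODONS = {"AGG", "AGA"}


def codons(cds):
    """Split a coding sequence into in-frame triplets, ignoring whitespace."""
    seq = "".join(cds.split()).upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def tandem_rare_arg(cds):
    """Return 1-based codon positions where a rare Arg codon is followed by another."""
    triplets = codons(cds)
    return [
        i + 1
        for i in range(len(triplets) - 1)
        if triplets[i] in RARE_ARG_CODONS and triplets[i + 1] in RARE_ARG_CODONS
    ]


# Codons 113-128 of SEQ ID NO:1 as printed in the Sequence Listing
fragment = "TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA"
hits = tandem_rare_arg(fragment)
print(hits)  # positions are relative to the start of the fragment
```

The single hit at fragment position 11 corresponds to the Arg-Arg pair at residues 123 and 124 of SEQ ID NO:2.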

Numerous modifications and variations of the present invention are included
in the above-identified specification and are expected to be obvious to one of
skill in the art. Such modifications and alterations to the compositions and
processes of the present invention are believed to be encompassed in the scope of
the claims appended hereto.
SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Shatzman, Allan
Scott, Miller
Dillon, Susan B.
Kane, James

(ii) TITLE OF INVENTION: Vaccinal Polypeptides

(iii) NUMBER OF SEQUENCES: 72

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SmithKline Beecham Corporation - Corporate Patents

(B) STREET: U.S. Mailcode UW2220 - 709 Swedeland Road

(C) CITY: King of Prussia

(D) STATE: Pennsylvania

(E) COUNTRY: USA

(F) ZIP: 19406-2799

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 149,150 

(B) FILING DATE: 05-NOV-1993 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 013,415 

(B) FILING DATE: 01-FEB-1993

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 108,914 

(B) FILING DATE: 18-AUG-1993 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 837,773 

(B) FILING DATE: 18-FEB-1992 


(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 751,896 

(B) FILING DATE: 30-AUG-1991 
(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 387,200

(B) FILING DATE: 28-JUL-1989

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 238,801

(B) FILING DATE: 02-NOV-1988

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 645,732 

(B) FILING DATE: 30-AUG-1984 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Baumeister, Kirk

(B) REGISTRATION NUMBER: 33,833

(C) REFERENCE/DOCKET NUMBER: P50134 PCT

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 215-270-5096

(B) TELEFAX: 215-270-5090

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 666 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..663

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT TGG GAG GGA 48
Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCT GAG GGC ACA 96
Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

GGA CAA GCA GCA GAT CTT AAA AGC ACT CAA GCA GCC ATC GAC CAA ATC 144
Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG AAC GAG AAA TTC CAT 192
Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG AGA ATT CAG GAC CTC 240
Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

GAG AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCG 288
Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

GAG CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT GAT CTG ACT GAC 336
Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA 384
Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC CAC AAA TGT 432
Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT TAT GAC CAT GAT 480
Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

GTA TAC AGA GAC GAA GCA TTA AAC AAC CGG TTT CAG ATC AAA GGT GTT 528
Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC CTG TGG ATT TCC TTT GCC 576
Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

ATA TCA TGC TTT TTG CTT TGT GTT GTT TTG CTG GGG TTC ATC ATG TGG 624
Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

GCC TGC CAG AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA 666
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220
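
The relationship between the stated length (666 base pairs), the CDS location (1..663) and the 221-residue protein of SEQ ID NO:2 can be checked mechanically. The Python sketch below is an illustration only: it builds the standard genetic code table, translates the first and last codon rows of SEQ ID NO:1 as printed, and confirms the codon count. The helper names are hypothetical and not part of the specification.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string to one-letter amino acids ('*' = stop)."""
    seq = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First and last codon rows of SEQ ID NO:1 as printed in the listing
first_row = "GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT TGG GAG GGA"
last_row = "GCC TGC CAG AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA"

total_bp = 666                 # LENGTH field
cds_end = 663                  # end of the CDS LOCATION field
coding_codons = cds_end // 3   # codons that encode amino acids

print(translate(first_row))
print(translate(last_row))
print(coding_codons, total_bp - cds_end)
```

The three bases beyond position 663 are the TGA stop codon, which is why the nucleotide listing is three bases longer than 3 x 221.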

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 221 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 666 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..663

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT TGG GAG GGA 48
Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCC GAG GGC ACA 96
Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

GGA CAA GCA GCA GAT CTT AAA AGC ACT CAA GCA GCC ATC GAC CAA ATC 144
Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG AAC GAG AAA TTC CAT 192
Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG AGA ATT CAG GAC CTC 240
Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

GAG AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCG 288
Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

GAG CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT GAT CTG ACT GAC 336
Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA 384
Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC CAC AAA TGT 432
Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT TAT GAC CAT GAT 480
Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

GTA TAC AGA GAC GAA GCA TTA AAC AAC CGG TTT CAG ATC AAA GGT GTT 528
Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC CTG TGG ATT TCC TTT GCC 576
Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

ATA TCA TGC TTT TTG CTT TGT GTT GTT TTG CTG GGG TTC ATC ATG TGG 624
Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

GCC TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA 666
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220
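
SEQ ID NO:1 and SEQ ID NO:3 encode the same amino acid sequence, so any nucleotide differences between them should be synonymous. The Python sketch below compares the two codon rows in which the printed listings visibly differ and classifies each change. It is a hedged illustration: the function name is hypothetical, and the rows are transcribed from the listing after correcting obvious scanning errors.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_differences(ref, alt, first_codon=1):
    """Return (position, ref_codon, alt_codon, kind) for each differing codon."""
    ref_codons, alt_codons = ref.split(), alt.split()
    if len(ref_codons) != len(alt_codons):
        raise ValueError("sequences must contain the same number of codons")
    found = []
    for offset, (r, a) in enumerate(zip(ref_codons, alt_codons)):
        if r != a:
            kind = "silent" if CODON_TABLE[r] == CODON_TABLE[a] else "missense"
            found.append((first_codon + offset, r, a, kind))
    return found


# Codons 17-32 (second row) and 209-221 plus stop (last row) of each listing
seq1_row2 = "ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCT GAG GGC ACA"
seq3_row2 = "ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCC GAG GGC ACA"
seq1_last = "GCC TGC CAG AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA"
seq3_last = "GCC TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA"

diffs = (codon_differences(seq1_row2, seq3_row2, first_codon=17)
         + codon_differences(seq1_last, seq3_last, first_codon=209))
for entry in diffs:
    print(entry)
```

Both differences (codon 29, TCT to TCC, and codon 211, CAG to CAA) leave the encoded residue unchanged, consistent with SEQ ID NO:2 and SEQ ID NO:4 being identical.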



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 221 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 670 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..666

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG GGA TGG ACT GGA 48
Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly
1 5 10 15

ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT GAA CAG GGA TCA 96
Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser
20 25 30

GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC ATT AAC GGG ATT 144
Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile
35 40 45

ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC ATT CAA TTC ACA 192
Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr
50 55 60

GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA 240
Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
65 70 75 80

AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA 288
Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
85 90 95

GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC 336
Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
100 105 110

TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT 384
Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
115 120 125

AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT 432
Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
130 135 140

GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC 480
Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
145 150 155 160

AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG 528
Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
165 170 175

AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT 576
Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
180 185 190

GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC 624
Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
195 200 205

TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC 666
Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
210 215 220

TGAG 670

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser
20 25 30

Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile
35 40 45

Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr
50 55 60

Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
65 70 75 80

Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
85 90 95






WO 94/17826



PCT/US94/01149 



Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
100 105 110

Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
115 120 125

Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
130 135 140

Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
145 150 155 160

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
165 170 175

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
180 185 190

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
195 200 205

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
210 215 220

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 670 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..670

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GGCATATTCG GCGCAATAGC AGGTTTCATA GAAAATGGTT GGGAGGGAAT GATAGACGGT 60

TGGTACGGTT TCAGGCATCA AAATTCNGAG GGCACAGGAC AAGCAGCAGA TCTTAAAAGC 120

ACTCAAGCAG CCATCGACCA AATCAATGGG AAACTGAATA GGGTAATCGA GAAGACGAAC 180

GAGAAATTCC ATCAAATCGA AAAGGAATTC TCAGAAGTAG AAGGGAGAAT TCAGGACCTC 240

GAGAAATACG TTGAAGACAC TAAAATAGAT CTCTGGTCTT ACAATGCGGA GCTTCTTGTC 300

GCTCTGGAGA ACCAACATAC AATTGATCTG ACTGACTCGG AAATGAACAA ACTGTTTGAA 360

AAAACAAGGA GGCAACTGAG GGAAAATGCT GAGGACATGG GCAATGGTTG CTTCAAAATA 420

TACCACAAAT GTGACAATGC TTGCATAGGG TCAATCAGAA ATGGGACTTA TGACCATGAT 480

GTATACAGAG ACGAAGCATT AAACAACCGG TTTCAGATCA AAGGTGTTGA ACTGAAGTCA 540

GGATACAAAG ACTGGATCCT GTGGATTTCC TTTGCCATAT CATGCTTTTT GCTTTGTGTT 600

GTTTTGCTGG GGTTCATCAN NNTGTGGGCC TGCCANAAAG GCAACATTAG GTGCAACATT 660

TGCATTTGAN 670

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Xaa Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe
180 185 190

Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met
195 200 205

Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 918 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..918

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GGC GCC CAT ATG GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA 288
Met Gly Ala His Met Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

AAT GGT TGG GAG GGA ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA 336
Asn Gly Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln
100 105 110

AAT TCT GAG GGC ACA GGA CAA GCA GCA GAT CTT AAA AGC ACT CAA GCA 384
Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala
115 120 125

GCC ATC GAC CAA ATC AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG 432
Ala Ile Asp Gln Ile Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr
130 135 140

AAC GAG AAA TTC CAT CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG 480
Asn Glu Lys Phe His Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly
145 150 155 160

AGA ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC 528
Arg Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu
165 170 175

TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA 576
Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr
180 185 190

ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG 624
Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg
195 200 205

AGG CAA CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA 672
Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys
210 215 220

ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG 720
Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly
225 230 235 240

ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA AAC AAC CGG TTT 768
Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe
245 250 255

CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC CTG 816
Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu
260 265 270

TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT GTT GTT TTG CTG 864
Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu
275 280 285

GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT 912
Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile
290 295 300

TGC ATT 918
Cys Ile
305
(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 306 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gly Ala His Met Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Asn Gly Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln
100 105 110

Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala
115 120 125

Ala Ile Asp Gln Ile Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr
130 135 140

Asn Glu Lys Phe His Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly
145 150 155 160

Arg Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu
165 170 175

Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr
180 185 190

Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg
195 200 205

Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys
210 215 220

Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly
225 230 235 240

Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe
245 250 255

Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu
260 265 270

Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu
275 280 285

Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile
290 295 300

Cys Ile
305



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 690 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..690

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GAT CAT ATG TTA ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT 288
Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

AAA ATA GAT CTC TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG 336
Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

AAC CAA CAT ACA ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT 384
Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

GAA AAA ACA AGG AGG CAA CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT 432
Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

GGT TGC TTC AAA ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA 480
Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

ATC AGA AAT GGG ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA 528
Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

AAC AAC CGG TTT CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA 576
Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT 624
Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

GTT GTT TTG CTG GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT 672
Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

AGG TGC AAC ATT TGC ATT 690
Arg Cys Asn Ile Cys Ile
225 230



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 230 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

Arg Cys Asn Ile Cys Ile
225 230



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 699 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..699

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TCC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG CAT GGA TCA TAT GTT 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly Ser Tyr Val
35 40 45

AAC AAG ACA CAA GAA GCT ATA AAC AAG ATA ACA AAA AAT CTC AAC TAT 192
Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr
50 55 60

TTA AGT GAG CTA GAA GTA AAA AAC CTT CAA AGA CTA AGC GGA GCA ATG 240
Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met
65 70 75 80

AAT GAG CTT CAC GAC GAA ATA CTC GAG CTA GAC GAA AAA GTG GAT GAT 288
Asn Glu Leu His Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp
85 90 95

CTA AGA GCT GAT ACA ATA AGC TCA CAA ATA GAG CTT GCA GTC TTG CTT 336
Leu Arg Ala Asp Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu
100 105 110

TCC AAC GAA GGG ATA ATA AAC AGT GAA GAT GAG CAT CTC TTG GCA CTT 384
Ser Asn Glu Gly Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu
115 120 125

GAA AGA AAA CTG AAG AAA ATG CTT GGC CCC TCT GCT GTA GAA ATA GGG 432
Glu Arg Lys Leu Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly
130 135 140

AAT GGG TGC TTT GAA ACC AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC 480
Asn Gly Cys Phe Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp
145 150 155 160

AGG ATA GCT GCT GGC ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT 528
Arg Ile Ala Ala Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr
165 170 175

TTT GAT TCA TTA AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG 576
Phe Asp Ser Leu Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu
180 185 190
GAT AAT CAT ACT ATA CTG CTC TAC TAC TCA ACT GCT GCT TCT AGC TTG 624
Asp Asn His Thr Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu
195 200 205

GCT GTA ACA TTA ATG ATA GCT ATC TTC ATT GTG TAC ATG GTG TCC AGA 672
Ala Val Thr Leu Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg
210 215 220

GAC AAT GTT TCT TGT TCC ATC TGT CTG 699
Asp Asn Val Ser Cys Ser Ile Cys Leu
225 230



(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly Ser Tyr Val
35 40 45

Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr
50 55 60

Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met
65 70 75 80

Asn Glu Leu His Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp
85 90 95

Leu Arg Ala Asp Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu
100 105 110

Ser Asn Glu Gly Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu
115 120 125

Glu Arg Lys Leu Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly
130 135 140

Asn Gly Cys Phe Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp
145 150 155 160

Arg Ile Ala Ala Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr
165 170 175

Phe Asp Ser Leu Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu
180 185 190

Asp Asn His Thr Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu
195 200 205

Ala Val Thr Leu Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg
210 215 220

Asp Asn Val Ser Cys Ser Ile Cys Leu
225 230



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..921 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GAT CTG TCC AGA GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA 288
Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

GGG GGA TGG ACT GGA ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG 336
Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

AAT GAA CAG GGA TCA GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT 384
Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

GCC ATT AAC GGG ATT ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG 432
Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

AAC ATT CAA TTC ACA GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA 480
Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys
145 150 155 160

AGG ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT 528
Arg Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile
165 170 175

TGG ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT 576
Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr
180 185 190

CTG GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA 624
Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys
195 200 205

AGC CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG 672
Ser Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu
210 215 220

TTC TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG 720
Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly
225 230 235 240

ACT TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA 768
Thr Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu
245 250 255

AAG GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG 816
Lys Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu
260 265 270

GCG ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG 864
Ala Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu
275 280 285

GGG GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA 912
Gly Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg
290 295 300

ATA TGC ATC TGA 924
Ile Cys Ile
305
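
The figures stated for SEQ ID NO: 15 can be cross-checked mechanically. The following is a minimal sketch, not part of the listing itself: it assumes the standard genetic code, and the helper names `translate` and `cds_residue_count` are hypothetical. It translates a row of codons into one-letter amino acid codes and confirms that a coding region spanning positions 1..921 corresponds to 307 residues, with the remaining three bases of the 924-base sequence forming the TGA stop codon.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering
# (assumption: no alternative translation table applies to these constructs).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":  # stop codon ends the reading frame
            break
        protein.append(residue)
    return "".join(protein)


def cds_residue_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# First row of SEQ ID NO: 15 as printed in the listing.
first_row = "ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG"
print(translate(first_row))       # one-letter codes for Met Asp Pro Asn ...
print(cds_residue_count(1, 921))  # residue count implied by LOCATION 1..921
```

The same check can be applied to the other nucleotide entries by passing their LOCATION coordinates and comparing the result with the LENGTH stated for the paired amino acid entry.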



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 307 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:



Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys
145 150 155 160

Arg Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile
165 170 175

Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr
180 185 190

Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys
195 200 205

Ser Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu
210 215 220

Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly
225 230 235 240

Thr Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu
245 250 255

Lys Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu
260 265 270

Ala Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu
275 280 285

Gly Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg
290 295 300

Ile Cys Ile
305



(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 729 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..726



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG 288
Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 336
Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 384
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 432
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 480
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 528
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 576
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 624
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 672
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
210 215 220

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 720
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
225 230 235 240

TGC ATC TGA 729
Cys Ile



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 242 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
210 215 220

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
225 230 235 240

Cys Ile



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 810 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..807 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG GAT CTG TCC AGA GGT 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp Leu Ser Arg Gly
35 40 45

CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG GGA TGG ACT GGA ATG 192
Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly Met
50 55 60

ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT GAA CAG GGA TCA GGC 240
Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser Gly
65 70 75 80

TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC ATT AAC GGG ATT ACA 288
Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile Thr
85 90 95

AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC ATT CAA TTC ACA GCT 336
Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr Ala
100 105 110

GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA AAT 384
Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
115 120 125

AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA GAA 432
Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
130 135 140

TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC TCA 480
Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
145 150 155 160

AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT AAT 528
Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
165 170 175

GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT GAC 576
Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
180 185 190

AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC AAA 624
Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
195 200 205

TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG AAA 672
Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
210 215 220

TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT GTC 720
Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
225 230 235 240

GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC TGG 768
Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
245 250 255

ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC TGA 810
Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
260 265



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 269 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp Leu Ser Arg Gly
35 40 45

Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly Met
50 55 60

Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser Gly
65 70 75 80

Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile Thr
85 90 95

Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr Ala
100 105 110

Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
115 120 125

Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
130 135 140

Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
145 150 155 160

Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
165 170 175

Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
180 185 190

Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
195 200 205

Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
210 215 220

Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
225 230 235 240

Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
245 250 255

Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
260 265

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..627

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG GAT CAT ATG TTA ACA 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp His Met Leu Thr
35 40 45

AGT ACT CGA TCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG 192
Ser Thr Arg Ser Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met
50 55 60

GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA 240
Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr
65 70 75 80

TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT 288
Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp
85 90 95

TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA 336
Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln
100 105 110

TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC 384
Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr
115 120 125

CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT 432
His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr
130 135 140

GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA 480
Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val
145 150 155 160

GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC 528
Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile
165 170 175

TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA 576
Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala
180 185 190

ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC 624
Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys
195 200 205

ATC TGA 630
Ile



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 209 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp His Met Leu Thr
35 40 45

Ser Thr Arg Ser Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met
50 55 60

Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr
65 70 75 80

Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp
85 90 95

Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln
100 105 110

Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr
115 120 125

His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr
130 135 140

Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val
145 150 155 160

Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile
165 170 175

Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala
180 185 190

Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys
195 200 205

Ile



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 717 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..714

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA 288
Met Gln Ile Pro Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
85 90 95

AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA 336
Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
100 105 110

GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC 384
Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
115 120 125

TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT 432
Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
130 135 140

AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT 480
Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
145 150 155 160

GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC 528
Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
165 170 175

AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG 576
Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
180 185 190

AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT 624
Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
195 200 205

GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC 672
Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
210 215 220

TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC 714
Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
225 230 235

TGA 717



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 238 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
85 90 95

Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
100 105 110

Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
115 120 125

Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
130 135 140

Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
145 150 155 160

Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
165 170 175

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
180 185 190

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
195 200 205

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
210 215 220

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
225 230 235



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 681 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..678

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 288
Met Gln Ile Pro Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
85 90 95

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 336
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
100 105 110

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 384
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
115 120 125

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 432
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
130 135 140

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 480
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
145 150 155 160

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 528
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
165 170 175

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 576
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
180 185 190

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 624
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
195 200 205

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 672
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
210 215 220

TGC ATC TGA 681
Cys Ile
225



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 226 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
85 90 95

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
100 105 110

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
115 120 125

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
130 135 140

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
145 150 155 160

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
165 170 175

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
180 185 190

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
195 200 205

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
210 215 220

Cys Ile
225

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 158 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Val Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
85 90 95

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
100 105 110

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
115 120 125

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
130 135 140

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
145 150 155

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Ser Cys Leu Thr Ala
145 150 155 160

Tyr His Arg



(2) INFORMATION FOR SEQ ID NO: 29: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 231 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Gly Gly Ser Tyr Ser Met Glu His
210 215 220

Phe Arg Trp Gly Lys Pro Val
225 230

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 225 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Gly Gly Ser Tyr Ser Met Leu Val
210 215 220

Asn
225



(2) INFORMATION FOR SEQ ID NO: 31: 



10 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 912 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..912

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG 288
Met Gln Ile Pro Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly
85 90 95

GGA TGG ACT GGA ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT 336
Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn
100 105 110

GAA CAG GGA TCA GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC 384
Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala
115 120 125



-91- 



WO 94/17826

PCT/US94/01149



ATT AAC GGG ATT ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC 432
Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn
130 135 140

ATT CAA TTC ACA GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG 480
Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
145 150 155 160

ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 528
Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
165 170 175

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 576
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
180 185 190

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 624
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
195 200 205

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 672
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
210 215 220

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 720
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
225 230 235 240

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 768
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
245 250 255

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 816
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
260 265 270

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 864
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
275 280 285

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 912
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
290 295 300



(2) INFORMATION FOR SEQ ID NO: 32: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp 
1 5 10 15 

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser 
35 40 45 

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile 
50 55 60 

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr 
65 70 75 80 

Met Gln Ile Pro Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly 
85 90 95 

Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn 
100 105 110 

Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala 
115 120 125 

Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn 
130 135 140 

Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg 
145 150 155 160 

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp 
165 170 175 

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu 
180 185 190 

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser 
195 200 205 

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe 
210 215 220 

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr 
225 230 235 240 

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys 
245 250 255 

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala 
260 265 270 

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly 
275 280 285 

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile 
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..471 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA AAT 48 
Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn 
1 5 10 15 

AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA GAA 96 
Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu 
20 25 30 

TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC TCA 144 
Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser 
35 40 45 

AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT AAT 192 
Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn 
50 55 60 

GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT GAC 240 
Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp 
65 70 75 80 

AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC AAA 288 
Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys 
85 90 95 

TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG AAA 336 
Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys 
100 105 110 

TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT GTC 384 
Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val 
115 120 125 

GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC TGG 432 
Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp 
130 135 140 

ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC TGA 474 
Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile 
145 150 155 
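
The reading frame of SEQ ID NO: 33 (CDS at positions 1..471, i.e. 157 codons followed by a stop codon) can be cross-checked against the protein of SEQ ID NO: 34 by translating individual rows of the listing. The following is only an illustrative sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), laid out with the
# bases in T, C, A, G order so that the 64 codons line up with AMINO.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding strand (one-letter amino acid codes), stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between codons
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First and last rows of SEQ ID NO: 33 as printed in the listing.
FIRST_ROW = "GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA AAT"
LAST_ROW = "ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC TGA"

print(translate(FIRST_ROW))  # residues 1-16 of SEQ ID NO: 34
print(translate(LAST_ROW))   # residues 145-157 of SEQ ID NO: 34
print(471 // 3)              # number of codons in the CDS
```

In one-letter code the two rows read VGKEFNKLEKRMENLN and MCSNGSLQCRICI, which correspond to the three-letter residues listed for SEQ ID NO: 34.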

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn 
1 5 10 15 

Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu 
20 25 30 

Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser 
35 40 45 

Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn 
50 55 60 

Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp 
65 70 75 80 

Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys 
85 90 95 

Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys 
100 105 110 

Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val 
115 120 125 

Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp 
130 135 140 

Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile 
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CATGGATCAT ATGTTAACAG ATATCAAGGC CTGACTGACT GAGAGCT 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CTCAGTCAGT CAGGCCTTGA TATCTGTTAA CATATGATC 
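
SEQ ID NOS: 35 and 36 are listed as two single strands of unequal length (47 and 39 bases). A simple consistency check on the listing is to test whether the shorter strand is the reverse complement of an internal stretch of the longer one, and which bases of the longer strand are left unpaired at each end. The sketch below is illustrative only; `reverse_complement` and the variable names are hypothetical, and nothing here is asserted about how the oligonucleotides were actually used.

```python
# Watson-Crick complement map for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


# Sequences exactly as given in the listing (blocks of ten bases).
SEQ_35 = "CATGGATCAT ATGTTAACAG ATATCAAGGC CTGACTGACT GAGAGCT"
SEQ_36 = "CTCAGTCAGT CAGGCCTTGA TATCTGTTAA CATATGATC"

top = "".join(SEQ_35.split())
core = reverse_complement(SEQ_36)  # region of SEQ 35 that SEQ 36 can pair with
start = top.find(core)             # -1 would mean the strands are not complementary

left_overhang = top[:start]
right_overhang = top[start + len(core):]

print(len(top), len(core))             # strand lengths: 47 and 39
print(start)                           # offset of the paired region in SEQ 35
print(left_overhang, right_overhang)   # unpaired ends of SEQ 35
```

On the sequences as listed, the paired region begins after the first four bases of SEQ ID NO: 35, leaving four unpaired bases at each end of the longer strand (CATG and AGCT).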



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
CATGGGCGCC CATATGGGCA TATTCGGCG 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
CCGAATATGC CCATATGGGC GCC 23 

(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CATGGATCAT ATGTTAACAA GTACTCGATA TCAATGAGTG ACTGAAGCT 49 


(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
TCAGTCACTC ATTGATATCG AGTACTTGTT AACATATGAT C 41 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

AATTCGTACC TA 



(2) INFORMATION FOR SEQ ID NO: 42: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GATCTAGGTA CG 


(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
AAACTGTTTG AAAAAACACG TCGTCAACTG CGTGAAAATG CTGACGACAT GGGC 




(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TGTGACAATG CTTGCATCGG TTCAATCCGT AATGGTACTT ATGACCATGA TG 52 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GATCCCGGGT GACTGACTGA 20 



(2) INFORMATION FOR SEQ ID NO: 46: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

GATCTCAGTC AGTCACCCGG 20 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
TAAGGAGGAT ATAACATATG 20 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GATCCATATG TTATATCCTC CTTAAGGT 

(2) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GCATCGCCAT GAGTCACGAC G 

(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GGAGGATGGG AAGGACTCAT TGCAGGTTGG 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CTCTGCTGTA GAAATCGGTA ACGGTTGCTT TGAAACCAAA C 41 

(2) INFORMATION FOR SEQ ID NO: 52: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

GGTTTCTTGG AAGGTGGTTG GGAAGGTCTC ATTGCAGGTT GGCACGG 47 


(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
GCTTTCCAAC GAAGGTATCA ATCAACAGTG AAGACGAGCA TCTCTTGG 48 




(2) INFORMATION FOR SEQ ID NO: 54: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1879..2790 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

AATTCTCATG TTTGACAGCT TATCATCGAT AAGCTTCAGT TGAAGATATT AAGAACAGCC 60 

TCGCAGATGA CGAATCATTG GGATTCCCAT CTTTTTTGTT TGTTGAAGGC GACACCATTG 120 

GTTTTGCCAG AACTGTTTTC GGGCCGACCA CATCCGATCT GACAGATTTT TTAATCGGGA 180 

AAGGAATGTC ATTAAGCAGT GGAGAGCGCG TTCAGATAGA GCCACTGATG AGGGGAACCA 240 

CCAAAGACGA TGTTATGCAT ATGCATTTCA TCGGCCGAAC AACGGTGAAG GTAGAAGCCA 300 

AGCTACCTGT ATTTGGCGAT ATATTAAAGG TCTTAGGGGC AACAGATATT GAAGGGGAGC 360 

TTTTTGACTC ATTGGATATA GTCATTAAGC CAAAATTTAA AAGGGATATA AAAAAGGTTG 420 

CCAAGGATAT TATTTTTAAC CCGTCACCTC AATTTTCAGA CATTAGCCTG CGGGCAAAAG 480 

ATGAGGCCGG AGATATTTTA ACAGAACATT ATCTATCAGA AAAAGGCCAT CTCTCAGCGC 540 

CTCTGAACAA GGTCACCAAT GCTGAGATAG CTGAAGAGAT GGCATATTGC TACGCAAGAA 600 

TGAAAAGTGA TATACTGGAA TGTTTTAAAA GGCAGGTGGG CAAAGTTAAG GATTAATTAT 660 

CAGGAGTAAT TATGCGGAAC AGAATCATGC CTGGTGTTTA CATAGTAATA ATTCCTTACG 720 

TTATCGTAAG CATTTGCTAT CTCCTTTTCC GCCACTACAT TCCTGGTGTT TCTTTTTCAG 780 

CTCATAGAGA TGGTCTTGGG GCGACATTGT CATCATATGC AGGAACCATG ATTGCAATCC 840 

TGATTGCTGC CTTGACGTTT CTAATCGGAA GCAGAACGCG CCGACTGGCC AAGATTAGAG 900 

AGTATGGGTA TATGACATCG GTAGTTATTG TCTATGCCCT TAGTTTTGTT GAGCTTGGAG 960 

CTTTGTTTTT CTGCGGGTTA TTGCTTCTTT CCAGCATAAG CGGCTACATG ATACCCACTA 1020 

TCGCCATCGG CATTGCCTCT GCATCGTTCA TTCATATATG CATCCTTGTT TTCCAACTAT 1080 

ATAATTTGAC CAGAGAACAA GAATAACCCG GCCTCAGCGC CGGGTTTTCT TTGCCTCACG 1140 

ATCGCCCCCA AAACACATAA CCAATTGTAT TTATTGAAAA ATAAATAGAT ACAACTCACT 1200 

AAACATAGCA ATTCAGATCT CTCACCTACC AAACAATGCC CCCCTGCAAA AAATAAATTC 1260 

ATATAAAAAA CATACAGATA ACCATCTGCG GTGATAAATT ATCTCTGGCG GTGTTGACAT 1320 

AAATACCACT GGCGGTGATA CTGAGCACAT CAGCAGGACG CACTGACCAC CATGAAGGTG 1380 

ACGCTCTTAA AAATTAAGCC CTGAAGAAGG GCAGCATTCA AAGCAGAAGG CTTTGGGGTG 1440 

TGTGATACGA AACGAAGCAT TGGCCGTAAG TGCGATTCCG GATTAGCTGC CAATGTGCCA 1500 

ATCGCGGGGG GTTTTCGTTC AGGACTACAA CTGCCACACA CCACCAAAGC TAACTGACAG 1560 

GAGAATCCAG ATGGATGCAC AAACACGCCG CCGCGAACGT CGCGCAGAGA AACAGGCTCA 1620 

ATGGAAAGCA GCAAATCCCC TGTTGGTTGG GGTAAGCGCA AAACCAGTTC CGAAAGATTT 1680 

TTTTAACTAT AAACGCTGAT GGAAGCGTTT ATGCGGAAGA GGTAAAGCCC TTCCCGAGTA 1740 

ACAAAAAAAC AACAGCATAA ATAACCCCGC TCTTACACAT TCCAGCCCTG AAAAAGGGCA 1800 

TCAAATTAAA CCACACCTAT GGTGTATGCA TTTATTTGCA TACATTCAAT CAATTGTTAT 1860 

CTAAGGAAAT ACTTACAT ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA 1911 
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val 
1 5 10 

GAT TGC TTT CTT TGG CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA 1959 
Asp Cys Phe Leu Trp His Val Arg Lys Arg Val Ala Asp Gln Glu Leu 
15 20 25 

GGT GAT GCC CCA TTC CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA 2007 
Gly Asp Ala Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu 
30 35 40 

AGA GGA AGG GGC AGC ACC CTC GGT CTG GAC ATC GAG ACA GCC ACA CGT 2055 
Arg Gly Arg Gly Ser Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg 
45 50 55 

GCT GGA AAG CAG ATA GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG 2103 
Ala Gly Lys Gln Ile Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu 
60 65 70 75 

GCA CTT AAA ATG ACC ATG GGT TTC TTC GGA GCT ATT GCT GGT TTC TTG 2151 
Ala Leu Lys Met Thr Met Gly Phe Phe Gly Ala Ile Ala Gly Phe Leu 
80 85 90 

GAA GGT GGT TGG GAA GGT CTC ATT GCA GGT TGG CAC GGA TAC ACA TCT 2199 
Glu Gly Gly Trp Glu Gly Leu Ile Ala Gly Trp His Gly Tyr Thr Ser 
95 100 105 

CAT GGA GCA CAT GGA GTG GCA GTG GCA GCA GAC CTT AAG AGT ACA CAA 2247 
His Gly Ala His Gly Val Ala Val Ala Ala Asp Leu Lys Ser Thr Gln 
110 115 120 

GAA GCT ATA AAC AAG ATA ACA AAA AAT CTC AAC TAT TTA AGT GAG CTA 2295 
Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr Leu Ser Glu Leu 
125 130 135 

GAA GTA AAA AAC CTT CAA AGA CTA AGC GGA GCA ATG AAT GAG CTT CAC 2343 
Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met Asn Glu Leu His 
140 145 150 155 

GAC GAA ATA CTC GAG CTA GAC GAA AAA GTG GAT GAT CTA AGA GCT GAT 2391 
Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp Leu Arg Ala Asp 
160 165 170 

ACA ATA AGC TCA CAA ATA GAG CTT GCA GTC TTG CTT TCC AAC GAA GGT 2439 
Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu Ser Asn Glu Gly 
175 180 185 

ATC ATC AAC AGT GAA GAC GAG CAT CTC TTG GCA CTT GAA AGA AAA CTG 2487 
Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu 
190 195 200 

AAG AAA ATG CTT GGC CCC TCT GCT GTA GAA ATC GGT AAC GGT TGC TTT 2535 
Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly Asn Gly Cys Phe 
205 210 215 

GAA ACC AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC AGG ATA GCT GCT 2583 
Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp Arg Ile Ala Ala 
220 225 230 235 

GGC ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT TTT GAT TCA TTA 2631 
Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu 
240 245 250 

AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG GAT AAT CAT ACT 2679 
Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu Asp Asn His Thr 
255 260 265 

ATA CTG CTC TAC TAC TCA ACT GCT GCT TCT AGC TTG GCT GTA ACA TTA 2727 
Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu Ala Val Thr Leu 
270 275 280 

ATG ATA GCT ATC TTC ATT GTC TAC ATG GTC TCC AGA GAC AAT GTT TCT 2775 
Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg Asp Asn Val Ser 
285 290 295 

TGT TCC ATC TGT CTG TGAGGGAGAT TAAGCCCTGT GTTTTCCTTT ACTGTAGTGC 2830 
Cys Ser Ile Cys Leu 
300 

TCATTTGCTT GTCACCATTA CAAAGAAACG TTATTGAAAA ATGCTCTTGT TACTACTGAA 2890 

TTCTAGAATC GATAAGCTTC GACCGATGCC CTTGAGAGCC TTCAACCCAG TCAGCTCCTT 2950 

CCGGTGGGCG CGGGGCATGA CTATCGTCGC CGCACTTATG ACTGTCTTCT TTATCATGCA 3010 

ACTCGTAGGA CAGGTGCCGG CAGCGCTCTG GGTCATTTTC GGCGAGGACC GCTTTCGCTG 3070 

GAGCGCGACG ATGATCGGCC TGTCGCTTGC GGTATTCGGA ATCTTGCACG CCCTCGCTCA 3130 

AGCCTTCGTC ACTGGTCCCG CCACCAAACG TTTCGGCGAG AAGCAGGCCA TTATCGCCGG 3190 

CATGGCGGCC GACGCGCTGG GCTACGTCTT GCTGGCGTTC GTCCAGTAAT GACCTCAGAA 3250 

CTCCATCTGG ATTTGTTCAG AACGCTCGGT TGCCGCCGGG CGTTTTTTAT TGGTGAGAAT 3310 

CGCAGCAACT TGTCGCGCCA ATCGAGCCAT GTCGTCGTCA ACGACCCCCC ATTCAAGAAC 3370 

AGCAAGCAGC ATTGAGAACT TTGGAATCCA GTCCCTCTTC CACCTGCTGA GACGCGAGGC 3430 

TGGATGGCCT TCCCCATTAT GATTCTTCTC GCTTCCGGCG GCATCGGGAT GCCCGCGTTG 3490 

CAGGCCATGC TGTCCAGGCA GGTAGATGAC GACCATCAGG GACAGCTTCA AGGATCGCTC 3550 

GCGGCTCTTA CCAGCCTAAC TTCGATCACT GGACCGCTGA TCGTCACGGC GATTTATGCC 3610 

GCCTCGGCGA GCACATGGAA CGGGTTGGCA TGGATTGTAG GCGCCGCCCT ATACCTTGTC 3670 

TGCCTCCCCG CGTTGCGTCG CGGTGCATGG AGCCGGGCCA CCTCGACCTG AATGGAAGCC 3730 

GGCGGCACCT CGCTAACGGA TTCACCACTC CAAGAATTGG AGCCAATCAA TTCTTGCGGA 3790 

GAACTGTGAA TGCGCAAACC AACCCTTGGC AGAACATATC CATCGCGTCC GCCATCTCCA 3850 

GCAGCCGCAC GCGGCGCATC TCGGGCAGCG TTGGGTCCTG GCCACGGGTG CGCATGATCG 3910 

TGCTCCTGTC GTTGAGGACC CGGCTAGGCT GGCGGGGTTG CCTTACTGGT TAGCAGAATG 3970 

AATCACCGAT ACGCGAGCGA ACGTGAAGCG ACTGCTGCTG CAAAACGTCT GCGACCTGAG 4030 

CAACAACATG AATGGTCTTC GGTTTCCGTG TTTCGTAAAG TCTGGAAACG CGGAAGTCAG 4090 

CGCCCTGCAC CATTATGTTC CGGATCTGCA TCGCAGGATG CTGCTGGCTA CCCTGTGGAA 4150 

CACCTACATC TGTATTAACG AAGGGCTGGC ATTGACCCTG AGTGATTTTT CTCTGGTCCC 4210 

GCCGCATCCA TACCGCCAGT TGTTTACCCT CACAACGTTC CAGTAACCGG GCATGTTCAT 4270 

CATCAGTAAC CCGTATCGTG AGCATCCTCT CTCGTTTCAT CGGTATCATT ACCCCCATGA 4330 

ACAGAAATTC CCCCTTACAC GGAGGCATCA AGTGACCAAA CAGGAAAAAA CCGCCCTTAA 4390 

CATGGCCCGC TTTATCAGAA GCCAGACATT AACGCTTCTG GAGAAACTCA ACGAGCTGGA 4450 

CGCGGATGAA CAGGCAGACA TCTGTGAATC GCTTCACGAC CACGCTGATG AGCTTTACCG 4510 

CAGCTGCCTC GCGCGTTTCG GTGATGACGG TGAAAACCTC TGACACATGC AGCTCCCGGA 4570 

GACGGTCACA GCTTGTCTGT AAGCGGATGC CGGGAGCAGA CAAGCCCGTC AGGGCGCGTC 4630 

AGCGGGTGTT GGCGGGTGTC GGGGCGCAGC CATGACCCAG TCACGTAGCG ATAGCGGAGT 4690 

GTATACTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATGCGG 4750 

TGTGAAATAC CGCACAGATG CGTAAGGAGA AAATACCGCA TCAGGCGCTC TTCCGCTTCC 4810 

TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 4870 

AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA 4930 

AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 4990 

CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 5050 

ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT 5110 

CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGGTT 5170 

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 5230 

TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 5290 

GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 5350 

AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 5410 

TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 5470 

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 5530 

TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 5590 

ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 5650 

TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA 5710 

AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 5770 

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 5830 

ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 5890 

TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 5950 

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 6010 

AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTGCAGG CATCGTGGTG 6070 

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 6130 

ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 6190 

AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 6250 

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTAGCTTCA CGCTGCCGCA 6310 

AGCACTCAGG GCGCAAGGGC TGCTAAAGGA AGCGGAACAC GTAGAAAGCC AGTCCGCAGA 6370 

AACGGTGCTG ACCCCGGATG AATGTCAGCT ACTGGGCTAT CTGGACAAGG GAAAACGCAA 6430 

GCGCAAAGAG AAAGCAGGTA GCTTGCAGTG GGCTTACATG GCGATAGCTA GACTGGGCGG 6490 

TTTTATGGAC AGCAAGCGAA CCGGAATTGC CAGCTGGGGC GCCCTCTGGT AAGGTTGGGA 6550 

AGCCCTGCAA AGTAAACTGG ATGGCTTTCT TGCCGCCAAG GATCTGATGG CGCAGGGGAT 6610 

CAAGATCTGA TCAAGAGACA GGATGAGGAT CGTTTCGCAT GATTGAACAA GATGGATTGC 6670 
ACGCAGGTTC TCCGGCCGCT TGGGTGGAGA GGCTATTCGG CTATGACTGG GCACAACAGA 6730 
CAATCGGCTG CTCTGATGCC GCCGTGTTCC GGCTGTCAGC GCAGGGGCGC CCGGTTCTTT 6790 
TTGTCAAGAC CGACCTGTCC GGTGCCCTGA ATGAACTGCA GGACGAGGCA GCGCGGCTAT 6850 
CGTGGCTGGC CACGACGGGC GTTCCTTGCG CAGCTGTGCT CGACGTTGTC ACTGAAGCGG 6910 
GAAGGGACTG GCTGCTATTG GGCGAAGTGC CGGGGCAGGA TCTCCTGTCA TCTCACCTTG 6970 
CTCCTGCCGA GAAAGTATCC ATCATGGCTG ATGCAATGCG GCGGCTGCAT ACGCTTGATC 7030 
CGGCTACCTG CCCATTCGAC CACCAAGGGA AACATCGCAT CGAGCGAGCA CGTACTCGGA 7090 
TGGAAGCCGG TCTTGTCGAT CAGGATGATC TGGACGAAGA GCATCAGGGG CTCGCGCCAG 7150 
CCGAACTGTT CGCCAGGCTC AAGGCGCGCA TGCCCGACGG CGAGGATCTC GTCGTGACTC 7210 
ATGGCGATGC CTGCTTGCCG AATATCATGG TGGAAAATGG CCGCTTTTCT GGATTCATCG 7270 
ACTGTGGCCG GCTGGGTGTG GCGGACCGCT ATCAGGACAT AGCGTTGGCT ACCCGTGATA 7330 
TTGCTGAAGA GCTTGGCGGC GAATGGGCTG ACCGCTTCCT CGTGCTTTAC GGTATCGCCG 7390 
CTCCCGATTC GCAGCGCATC GCCTTCTATC GCCTTCTTGA CGAGTTCTTC TGAGCGGGAC 7450 
TCTGGGGTTC GAAATGACCG ACCAAGCGAC GCCCAACCTG CCATCACGAG ATTTCGATTC 7510 
CACCGCCGCC TTCTATGAAA GGTTGGGCTT CGGAATCGTT TTCCGGGACG CCGGCTGGAT 7570 
GATCCTCCAG CGCGGGGATC TCATGCTGGA GTTCTTCGCC CACCCC 7616 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp 
1 5 10 15 

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser 
35 40 45 

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile 
50 55 60 

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr 
65 70 75 80 

Met Gly Phe Phe Gly Ala Ile Ala Gly Phe Leu Glu Gly Gly Trp Glu 
85 90 95 

Gly Leu Ile Ala Gly Trp His Gly Tyr Thr Ser His Gly Ala His Gly 
100 105 110 

Val Ala Val Ala Ala Asp Leu Lys Ser Thr Gln Glu Ala Ile Asn Lys 
115 120 125 

Ile Thr Lys Asn Leu Asn Tyr Leu Ser Glu Leu Glu Val Lys Asn Leu 
130 135 140 

Gln Arg Leu Ser Gly Ala Met Asn Glu Leu His Asp Glu Ile Leu Glu 
145 150 155 160 

Leu Asp Glu Lys Val Asp Asp Leu Arg Ala Asp Thr Ile Ser Ser Gln 
165 170 175 

Ile Glu Leu Ala Val Leu Leu Ser Asn Glu Gly Ile Ile Asn Ser Glu 
180 185 190 

Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu Lys Lys Met Leu Gly 
195 200 205 

Pro Ser Ala Val Glu Ile Gly Asn Gly Cys Phe Glu Thr Lys His Lys 
210 215 220 

Cys Asn Gln Thr Cys Leu Asp Arg Ile Ala Ala Gly Thr Phe Asn Ala 
225 230 235 240 

Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu Asn Ile Thr Ala Ala 
245 250 255 

Ser Leu Asn Asp Asp Gly Leu Asp Asn His Thr Ile Leu Leu Tyr Tyr 
260 265 270 

Ser Thr Ala Ala Ser Ser Leu Ala Val Thr Leu Met Ile Ala Ile Phe 
275 280 285 

Ile Val Tyr Met Val Ser Arg Asp Asn Val Ser Cys Ser Ile Cys Leu 
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 915 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..912 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48 
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp 
1 5 10 15 

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96 
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144 
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser 
35 40 45 

ACC CTC GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192 
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile 
50 55 60 

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240 
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr 
65 70 75 80 

ATG GGT TTC TTC GGA GCT ATT GCT GGT TTC TTG GAA GGA GGA TGG GAA 288 
Met Gly Phe Phe Gly Ala Ile Ala Gly Phe Leu Glu Gly Gly Trp Glu 
85 90 95 

GGA ATG ATT GCA GGT TGG CAC GGA TAC ACA TCT CAT GGA GCA CAT GGA 336 
Gly Met Ile Ala Gly Trp His Gly Tyr Thr Ser His Gly Ala His Gly 
100 105 110 

GTG GCA GTG GCA GCA GAC CTT AAG AGT ACA CAA GAA GCT ATA AAC AAG 384 
Val Ala Val Ala Ala Asp Leu Lys Ser Thr Gln Glu Ala Ile Asn Lys 
115 120 125 

ATA ACA AAA AAT CTC AAC TAT TTA AGT GAG CTA GAA GTA AAA AAC CTT 432 
Ile Thr Lys Asn Leu Asn Tyr Leu Ser Glu Leu Glu Val Lys Asn Leu 
130 135 140 

CAA AGA CTA AGC GGA GCA ATG AAT GAG CTT CAC GAC GAA ATA CTC GAG 480 
Gln Arg Leu Ser Gly Ala Met Asn Glu Leu His Asp Glu Ile Leu Glu 
145 150 155 160 

CTA GAC GAA AAA GTG GAT GAT CTA AGA GCT GAT ACA ATA AGC TCA CAA 528 
Leu Asp Glu Lys Val Asp Asp Leu Arg Ala Asp Thr Ile Ser Ser Gln 
165 170 175 

ATA GAG CTT GCA GTC TTG CTT TCC AAC GAA GGG ATA ATA AAC AGT GAA 576 
Ile Glu Leu Ala Val Leu Leu Ser Asn Glu Gly Ile Ile Asn Ser Glu 
180 185 190 

GAT GAG CAT CTC TTG GCA CTT GAA AGA AAA CTG AAG AAA ATG CTT GGC 624 
Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu Lys Lys Met Leu Gly 
195 200 205 

CCC TCT GCT GTA GAA ATA GGG AAT GGG TGC TTT GAA ACC AAA CAC AAA 672 
Pro Ser Ala Val Glu Ile Gly Asn Gly Cys Phe Glu Thr Lys His Lys 
210 215 220 

TGC AAC CAG ACT TGC CTA GAC AGG ATA GCT GCT GGC ACC TTT AAT GCA 720 
Cys Asn Gln Thr Cys Leu Asp Arg Ile Ala Ala Gly Thr Phe Asn Ala 
225 230 235 240 

GGA GAT TTT TCT CTT CCC ACT TTT GAT TCA TTA AAC ATT ACT GCT GCA 768 
Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu Asn Ile Thr Ala Ala 
245 250 255 

TCT TTA AAT GAT GAT GGC TTG GAT AAT CAT ACT ATA CTG CTC TAC TAC 816 
Ser Leu Asn Asp Asp Gly Leu Asp Asn His Thr Ile Leu Leu Tyr Tyr 
260 265 270 

TCA ACT GCT GCT TCT AGC TTG GCT GTA ACA TTA ATG ATA GCT ATC TTC 864 
Ser Thr Ala Ala Ser Ser Leu Ala Val Thr Leu Met Ile Ala Ile Phe 
275 280 285 

ATT GTC TAC ATG GTC TCC AGA GAC AAT GTT TCT TGT TCC ATC TGT CTG 912 
Ile Val Tyr Met Val Ser Arg Asp Asn Val Ser Cys Ser Ile Cys Leu 
290 295 300 

TGA 915 


(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp 
1 5 10 15 

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser 
35 40 45 

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile 
50 55 60 

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr 
65 70 75 80 

Met Gly Phe Phe Gly Ala Ile Ala Gly Phe Leu Glu Gly Gly Trp Glu 
85 90 95 

Gly Met Ile Ala Gly Trp His Gly Tyr Thr Ser His Gly Ala His Gly 
100 105 110 

Val Ala Val Ala Ala Asp Leu Lys Ser Thr Gln Glu Ala Ile Asn Lys 
115 120 125 

Ile Thr Lys Asn Leu Asn Tyr Leu Ser Glu Leu Glu Val Lys Asn Leu 
130 135 140 

Gln Arg Leu Ser Gly Ala Met Asn Glu Leu His Asp Glu Ile Leu Glu 
145 150 155 160 

Leu Asp Glu Lys Val Asp Asp Leu Arg Ala Asp Thr Ile Ser Ser Gln 
165 170 175 

Ile Glu Leu Ala Val Leu Leu Ser Asn Glu Gly Ile Ile Asn Ser Glu 
180 185 190 

Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu Lys Lys Met Leu Gly 
195 200 205 

Pro Ser Ala Val Glu Ile Gly Asn Gly Cys Phe Glu Thr Lys His Lys 
210 215 220 

Cys Asn Gln Thr Cys Leu Asp Arg Ile Ala Ala Gly Thr Phe Asn Ala 
225 230 235 240 

Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu Asn Ile Thr Ala Ala 
245 250 255 

Ser Leu Asn Asp Asp Gly Leu Asp Asn His Thr Ile Leu Leu Tyr Tyr 
260 265 270 

Ser Thr Ala Ala Ser Ser Leu Ala Val Thr Leu Met Ile Ala Ile Phe 
275 280 285 

Ile Val Tyr Met Val Ser Arg Asp Asn Val Ser Cys Ser Ile Cys Leu 
290 295 300 




(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
ATGGATCCAA ACACTGTGTC AAGCTTTCAG GTAGATTGCT TTCTTTGGCA TGTCCGCAAA 60 
CGAGTTGCAG ACCAAGAACT AGGTGATGCC CCATTCCTTG ATCGGCTTCG CCGAGATCAG 120 
AAATCCCTAA GAGGAAGGGG CAGCACTCTT GGTCTGGACA TCGAGACAGC CACACGTGCT 180 
GGAAAGCAGA TAGTGGAGCG GATTCTGAAA GAAGAATCCG ATGAGGCACT TAAAATGACC 240 
ATGGGCGCCC ATATGGGCAT ATTCGGCGCA ATAGCAGGTT TCATAGAAAA TGGTTGGGAG 300 
GGAATGATAG ACGGTTGGTA CGGTTTCAGG CATCAAAATT CTGAGGGCAC AGGACAAGCA 360 
GCAGATCTTA AAAGCACTCA AGCAGCCATC GACCAAATCA ATGGGAAACT GAATAGGGTA 420 
ATCGAGAAGA CGAACGAGAA ATTCCATCAA ATCGAAAAGG AATTCTCAGA AGTAGAAGGG 480 
AGAATTCAGG ACCTCGAGAA ATACGTTGAA GACACTAAAA TAGATCTCTG GTCTTACAAT 540 
GCGGAGCTTC TTGTCGCTCT GGAGAACCAA CATACAATTG ATCTGACTGA CTCGGAAATG 600 
AACAAACTGT TTGAAAAAAC ACGTCGTCAA CTGCGTGAAA ATGCTGAGGA CATGGGCAAT 660 
GGTTGCTTCA AAATATACCA CAAATGTGAC AATGCTTGCA TAGGGTCAAT CAGAAATGGG 720 
ACTTATGACC ATGATGTATA CAGAGACGAA GCATTAAACA ACCGGTTTCA GATCAAAGGT 780 
GTTGAACTGA AGTCAGGATA CAAAGACTGG ATCCTGTGGA TTTCCTTTGC CATATCATGC 840 
TTTTTGCTTT GTGTTGTTTT GCTGGGGTTC ATCATGTGGG CCTGCCAAAA AGGCAACATT 900 
AGGTGCAACA TTTGCATT 918 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly 
1 5 10 15 

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr 
20 25 30 

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile 
35 40 45 

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His 
50 55 60 

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu 
65 70 75 80 

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala 
85 90 95 

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp 
100 105 110 

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu 
115 120 125 

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys 
130 135 140 

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp 
145 150 155 160 

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val 
165 170 175 

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala 
180 185 190 

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp 
195 200 205 

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile 
210 215 220 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 221 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Arg Arg Xaa Xaa Arg
1 5

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CGNCGNNNNN NNCGN

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

AGRAGRNNNN NNAGR
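
SEQ ID NOS: 62 and 63 are written with degenerate nucleotide symbols, and each is the length of a coding sequence for the five-residue motif of SEQ ID NO: 61. The following is a minimal sketch only, assuming the standard IUPAC meanings (N = any base, R = A or G) and the standard genetic code, in which CGN and AGR both encode Arg. It expands a degenerate pattern into a regular expression and scans a DNA string for it. The function names and the two example strings are illustrative and are not taken from the specification.

```python
import re

# IUPAC nucleotide symbols appearing in SEQ ID NOS: 62 and 63
# (assumed to carry their standard meanings).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def degenerate_to_regex(pattern: str) -> str:
    """Translate a degenerate DNA pattern into regular-expression source."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_motif(dna: str, pattern: str) -> list[int]:
    """Return 0-based start positions at which the degenerate pattern occurs."""
    # Wrapping the pattern in a lookahead also reports overlapping matches.
    lookahead = re.compile(f"(?={degenerate_to_regex(pattern)})")
    return [match.start() for match in lookahead.finditer(dna.upper())]


SEQ_62 = "CGNCGNNNNNNNCGN"
SEQ_63 = "AGRAGRNNNNNNAGR"

# Hypothetical test strings, read in frame as
# ACA CGT CGT CAA CTG CGT GAA and ACA AGG AGG CAA CTG AGG GAA.
cgn_example = "ACACGTCGTCAACTGCGTGAA"
agr_example = "ACAAGGAGGCAACTGAGGGAA"

print(find_motif(cgn_example, SEQ_62))
print(find_motif(agr_example, SEQ_63))
```

The scan is frame-agnostic, so a reported position would still need to be checked against the reading frame before it is treated as an Arg-Arg-Xaa-Xaa-Arg coding site.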
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

CATGGATCAT ATGTTAACAG ATATCAAGGC CTGACTGACT GAGAGCT

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CTCAGTCAGT CAGGCCTTGA TATCTGTTAA CATATGATC

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

CATGGGCGCC CATATGGGCA TATTCGGCG

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CCGAATATGC CCATATGGGC GCC
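
The single-stranded sequences of SEQ ID NOS: 64 and 65, and likewise SEQ ID NOS: 66 and 67, are complementary over the full length of the shorter strand. The following is a minimal sketch of that check, assuming ordinary Watson-Crick pairing. The listing itself does not say how the oligonucleotides are used, so the helper names and the "overhang" interpretation are assumptions made for illustration.

```python
# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def overhangs(top: str, bottom: str) -> tuple[str, str]:
    """Return the bases of `top` left unpaired at its 5' and 3' ends
    when `bottom` anneals to it over its whole length."""
    core = reverse_complement(bottom)
    start = top.find(core)
    if start < 0:
        raise ValueError("bottom strand is not fully complementary to top strand")
    return top[:start], top[start + len(core):]


# Sequences copied from the listing with the spacing removed.
SEQ_64 = "CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT"
SEQ_65 = "CTCAGTCAGTCAGGCCTTGATATCTGTTAACATATGATC"
SEQ_66 = "CATGGGCGCCCATATGGGCATATTCGGCG"
SEQ_67 = "CCGAATATGCCCATATGGGCGCC"

print(overhangs(SEQ_64, SEQ_65))
print(overhangs(SEQ_66, SEQ_67))
```

Under these assumptions each pair leaves a four-base CATG end at the 5' side of the longer strand, with AGCT and CG respectively left unpaired at its 3' side.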

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

AAACTGTTTG AAAAAACACG TCGTCAACTG CGTGAAAATG CTGACGACAT GGGC


(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 145 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp
1 5 10 15

Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile
20 25 30

Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg
35 40 45

Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile
50 55 60

Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr
65 70 75 80

Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln
85 90 95

Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp
100 105 110

Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly
115 120 125

Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys
130 135 140

Ile
145
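
The 145-residue sequence of SEQ ID NO: 69 begins with the residues found at positions 77 onward of the 221-residue sequence of SEQ ID NO: 59, and 221 - 77 + 1 = 145. The following is a minimal sketch of that 1-based, inclusive residue numbering. The helper name `fragment` and the placeholder list are assumptions introduced for illustration. The only sequence data used is the row of residues 65 to 80 printed for SEQ ID NO: 59.

```python
def fragment(residues: list[str], first: int, last: int) -> list[str]:
    """Return residues first..last, using 1-based inclusive numbering."""
    if not 1 <= first <= last <= len(residues):
        raise ValueError("residue range lies outside the sequence")
    return residues[first - 1:last]


# Placeholder for a 221-residue chain; only its length matters here.
placeholder = [f"res{i}" for i in range(1, 222)]
print(len(fragment(placeholder, 77, 221)))

# Residues 65 to 80 as printed for SEQ ID NO: 59.
WINDOW_START = 65
window = "Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu".split()

# Residue 77 sits at offset 77 - 65 = 12 within this window.
print(window[77 - WINDOW_START:])
```

The second print shows Ile Gln Asp Leu, the same four residues that open SEQ ID NO: 69.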



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 145 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp
1 5 10 15

Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile
20 25 30

Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg
35 40 45

Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile
50 55 60

Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr
65 70 75 80

Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln
85 90 95

Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp
100 105 110

Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly
115 120 125

Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys
130 135 140

Ile
145


(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 690 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..690

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GAT CAT ATG TTA ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT 288
Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

AAA ATA GAT CTC TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG 336
Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

AAC CAA CAT ACA ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT 384
Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

GAA AAA ACA AGG AGG CAA CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT 432
Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

GGT TGC TTC AAA ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA 480
Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

ATC AGA AAT GGG ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA 528
Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

AAC AAC CGG TTT CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA 576
Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT 624
Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

GTT GTT TTG CTG GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT 672
Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

AGG TGC AAC ATT TGC ATT 690
Arg Cys Asn Ile Cys Ile
225 230
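
The coding sequence of SEQ ID NO: 71 is printed with its translation beneath each row of codons. The following is a minimal sketch of how such a row can be checked, assuming the standard genetic code and one-letter amino acid symbols in place of the three-letter symbols used in the listing. The names `CODON_TABLE` and `translate` are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    sequence = "".join(dna.upper().split())  # drop the spacing between codons
    if len(sequence) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(
        CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3)
    )


# First and last rows of codons as printed for SEQ ID NO: 71.
first_row = "ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG"
last_row = "AGG TGC AAC ATT TGC ATT"

print(translate(first_row))
print(translate(last_row))
```

A row whose translation contains "*" before the final codon would indicate a transcription error in that row, since the feature table describes positions 1..690 as a single CDS.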
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

Arg Cys Asn Ile Cys Ile
225 230

WHAT IS CLAIMED IS:

1. A vaccine for stimulating protection in animals against
infection by influenza virus which comprises an effective amount of an
immunogenic fragment of the HA2 subunit of an HA protein selected from the
group consisting of a Type A subtype influenza virus and a Type B influenza virus.

2. The vaccine according to claim 1 wherein said Type A
subunit is H3N2.

3. The vaccine according to claim 1 wherein the polypeptide is
fused to a second polypeptide.

4. The vaccine according to claim 3 wherein the second
polypeptide comprises the N terminal amino acids of influenza NS1 protein.

5. The vaccine according to claim 1 wherein the immunogenic
fragment of the HA2 subunit is selected from the group consisting of a peptide
comprising amino acids 1 to 221 of the H3HA2 subtype, a peptide comprising
amino acids 77 to 221 of the H3HA2 subtype, a peptide comprising amino acids 1 to
223 of the BHA2 Type, and a peptide comprising amino acids 41 to 223 of the
BHA2 type.

6. The vaccine according to claim 5 comprising
NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

7. The vaccine according to claim 5 comprising
NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

8. The vaccine according to claim 5 comprising
NS1(1-42)BLHA2(41-223) SEQ ID NO: 14.

9. The vaccine according to claim 5 comprising
NS1(1-81)BLHA2(1-223) SEQ ID NO: 57.

10. The vaccine according to claim 5 comprising
NS1(1-81)H3HA2(1-221) SEQ ID NO: 10 and NS1(1-81)BLHA2(1-223)(met-leu) SEQ ID
NO: 55.

11. A protein comprising an immunogenic fragment of the HA2
subunit of an HA protein selected from the group consisting of Type A subtype or
Type B influenza virus.

12. The protein according to claim 11 wherein said Type A
subtype is H3N2.

13. The protein according to claim 11 wherein the peptide
containing the immunogenic fragment is fused to a second peptide or protein.

14. The protein according to claim 13 wherein the second peptide
comprises the N terminal amino acids of a NS1 protein.

15. The protein according to claim 11 wherein the immunogenic
fragment of the HA2 subunit is selected from the group consisting of a peptide
comprising amino acids 1 to 221 of the H3HA2 subunit, a peptide comprising amino
acids 77 to 221 of the H3HA2 subunit, a peptide comprising amino acids 1-223 of
the BHA2 subunit, and a peptide comprising amino acids 41-223 of the BHA2
subunit.

16. A polypeptide NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

17. A polypeptide NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

18. A polypeptide NS1(1-41)BLHA2(41-223) SEQ ID NO: 14.

19. A polypeptide NS1(1-81)BLHA2(1-223) SEQ ID NO: 57.

20. A polypeptide NS1(1-81)BLHA2(1-223)(met-leu) SEQ ID
NO: 55.

21. A DNA molecule comprising a coding sequence for an
immunogenic fragment of the HA2 subunit of an HA protein selected from the
group consisting of a Type A subtype or Type B influenza virus.

22. The DNA molecule according to claim 21 wherein said Type
A subunit is H3N2.

23. The DNA molecule according to claim 22 comprising a coding
sequence for the polypeptide NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

24. The DNA molecule according to claim 21 comprising a
coding sequence for the polypeptide NS1(1-42)BLHA2(41-223) SEQ ID NO: 14.

25. The DNA molecule according to claim 21 comprising a
coding sequence for the polypeptide NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

26. The DNA molecule according to claim 21 comprising a
coding sequence for the polypeptide NS1(1-81)BLHA2(1-223) SEQ ID NO: 57.

27. A vector pOTS208NS1BLmut2 SEQ ID NO: 54.

28. A microorganism transformed with a DNA molecule
comprising a coding sequence for an immunogenic fragment of the HA2 subunit of
an HA protein selected from the group consisting of a Type A subtype or Type B
influenza virus.

29. The microorganism according to claim 28 wherein said Type
A subunit is H3N2.

30. The microorganism according to claim 28 wherein said DNA
molecule comprises a coding sequence for the polypeptide
NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

31. The microorganism according to claim 28 wherein said DNA
molecule comprises a coding sequence for the polypeptide
NS1(1-81)BLHA2(1-223) SEQ ID NO: 57.

32. The microorganism according to claim 28 wherein said DNA
molecule comprises a coding sequence for the polypeptide
NS1(1-81)BLHA2(1-223)(met-leu) SEQ ID NO: 55.

33. A combination vaccine for stimulating protection in animals
against infection by influenza virus which comprises a first polypeptide having an
immunogenic fragment of the HA2 subunit of an influenza H3 subtype virus and a
second polypeptide selected from the group consisting of a polypeptide having an
immunogenic fragment of the HA2 subunit of a Type B influenza virus, and a
polypeptide having an immunogenic fragment of the HA2 subunit of an H1 subtype
influenza virus, and a polypeptide having an immunogenic fragment of the HA2
subunit of an H2 subtype influenza virus.

34. The combination vaccine according to claim 33 wherein the
first polypeptide is selected from the group consisting of NS1(1-81)H3HA2(1-221)
SEQ ID NO: 10 and NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

35. The combination vaccine according to claim 33 wherein the
second polypeptide is a polypeptide having an immunogenic fragment of the HA2
subunit of an H1 subtype influenza virus.

36. The combination vaccine according to claim 33 wherein said
second polypeptide is selected from the group consisting of C13 SEQ ID NO: 16, D
SEQ ID NO: 18, C13 short SEQ ID NO: 20, D short SEQ ID NO: 22, A SEQ ID
NO: 24, C SEQ ID NO: 26, AD SEQ ID NO: 27, A13 SEQ ID NO: 28, M SEQ ID
NO: 29, AM SEQ ID NO: 30, AM+ SEQ ID NO: 32, and H1HA2(66-222) SEQ ID
NO: 34.

37. The combination vaccine according to claim 33 wherein said
second polypeptide is NS1(1-42)BLHA2(41-223) SEQ ID NO: 14.

38. The combination vaccine according to claim 33 wherein said
second polypeptide is NS1(1-81)BLHA2(1-223) SEQ ID NO: 57.

39. A combination vaccine for stimulating protection in animals
against infection by influenza virus which comprises a first polypeptide having an
immunogenic fragment of the HA2 subunit of an influenza H3 subtype virus, a
second polypeptide having an immunogenic fragment of the HA2 subunit of an
influenza B Type virus, and a third polypeptide selected from the group consisting
of a polypeptide having an immunogenic fragment of the HA2 subunit of an H1
subtype influenza virus and a polypeptide having an immunogenic fragment of the
HA2 subunit of an H2 subtype influenza virus.

40. The combination vaccine according to claim 39 wherein the
first polypeptide is NS1(1-81)H3HA2(1-221) SEQ ID NO: 10, the second
polypeptide is NS1(1-81)BHA2(1-223)(met-leu) SEQ ID NO: 57, and the third
polypeptide is NS1(1-81)HA2(65-222) SEQ ID NO: 18.

FIGURE 1

(a) — 

(b) 

(c) — tc 1- -a — c— t— c 1— t — ggg~a act 

(d) GGCATATTCG GCGCAATAGC AGGTTTCATA GAAAATGGTT GGGAGGGAAT 50 

(a) 1 

(b) c

(c) 1 — a atcat g gaac — — at ct 

(d) GATAGACGGT TGGTACGGTT TCAGGCATCA AAATTC-GAG GGCACAGGAC 100 

(a) 

(b) 

(c) -t g— — aa — a — aat- ta — gg g — t-caaac 

(d) AAGCAGCAGA TCTTAAAAGC ACTCAAGCAG CCATCGACCA AATCAATGGG 150 

(a) 

(b) . 

(c) gg ct ct~ t ^-a-t attc a cagctg-g-g 

(d) AAACTGAATA GGGTAATCGA GAAGACGAAC GAGAAATTCC ATCAAATCGA 200 

(a) - - 

(b) 

(c) t — a aaca — t — aaa — g — gg-aa-tt-ra a-t a-a- 

(d) AAAGGAATTC TCAGAAGTAG AAGGGAGAAT TCAGGACCTC GAGAAATACG 250

(a) - 

(b) 

(c) 1— tgg atttc-g--c a-t a-a- -t a— at-gt-a— t 

(d) TTGAAGACAC TAAAATAGAT CTCTGGTCTT ACAATGCGGA GCTTCTTGTC 300

(a) 

(b) 

(c) eta a tg — agg — tc-g t-c ca aa -tg g — 

(d) GCTCTGGAGA ACCAACATAC AATTGATCTG ACTGACTCGG AAATGAACAA 350 

(a) 

(b) 

(c) t a g gt— aa c t-a-a -a-t c a-a— a — c- 

(d) ACTGTTTGAA AAAACAAGGA GGCAACTGAG GGAAAATGCT GAGGACATGG 400

(a) - -

(b) - 

(c) -a a~ t~tg-gt-c g a a g-aa 

(d) GCAATGGTTG CTTCAAAATA TACCACAAAT GTGACAATGC TTGCATAGGG 450 

(a) 

(b) .

(c) agtg-a tt — ccc aa ttc a — gt— aa 

(d) TCAATCAGAA ATGGGACTTA TGACCATGAT GTATACAGAG ACGAAGCATT 500 

(a) - 

(b) ' : 

(c) gttg a — gaaa — g-ag -t — a-^ga 1 — g-a atgggg-tct 

(d) AAACAACCGG TTTCAGATCA AAGGTGTTGA ACTGAAGTCA GGATACAAAG 550

FIGURE 1 (cont'd)

(a)

(b)

(c) -tea — t — -gc — c-a- -caa-tg-cg -ca-t-cac- -g-gct-t-g . 

(d) ACTGGATCCT GTGGATTTCC TTTGCCATAT CATGCTTTTT GCTTTGTGTT 600 

(a) . . . : g 

(b) . . . a 

(c) — -c-cc ~gca g tt-c — atg —ttct— t — atctt-gca 

(d) GTTTTGCTGG GGTTCATCAt —TGTGGGCC TGCCA-AAAG GCAACATTAG 650 

(a) . 

(b) . 

(c) ga — a c — g 

(d) GTGCAACATT TGCATTTGA- 670 
FIGURE 2

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT 42
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe
1 5 10

CTT TGG CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT 84
Leu Trp His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly
15 20 25

GAT GCC CCA TTC CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC 126
Asp Ala Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser
30 35 40

CTA AGA GGA AGG GGC AGC ACT CTT GGT CTG GAC ATC GAG ACA 168
Leu Arg Gly Arg Gly Ser Thr Leu Gly Leu Asp Ile Glu Thr
45 50 55

GCC ACA CGT GCT GGA AAG CAG ATA GTG GAG CGG ATT CTG AAA 210
Ala Thr Arg Ala Gly Lys Gln Ile Val Glu Arg Ile Leu Lys
60 65 70

GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC ATG GGC GCC CAT 252
Glu Glu Ser Asp Glu Ala Leu Lys Met Thr Met Gly Ala His
75 80

ATG GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT 294
Met Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly
85 90 95

TGG GAG GGA ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA 336
Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln
100 105 110

AAT TCT GAG GGC ACA GGA CAA GCA GCA GAT CTT AAA AGC ACT 378
Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys Ser Thr
115 120 125

CAA GCA GCC ATC GAC CAA ATC AAT GGG AAA CTG AAT AGG GTA 420
Gln Ala Ala Ile Asp Gln Ile Asn Gly Lys Leu Asn Arg Val
130 135 140

ATC GAG AAG ACG AAC GAG AAA TTC CAT CAA ATC GAA AAG GAA 462
Ile Glu Lys Thr Asn Glu Lys Phe His Gln Ile Glu Lys Glu
145 150

TTC TCA GAA GTA GAA GGG AGA ATT CAG GAC CTC GAG AAA TAC 504
Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu Glu Lys Tyr
155 160 165

GTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCG GAG 546
Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu
170 175 180
FIGURE 2 (cont'd)

CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT GAT CTG ACT 588
Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr
185 190 195

C T C T

GAC TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA 630
Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln
200 205 210

C T

CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA 672
Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys
215 220

ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA ATC AGA 714
Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg
225 230 235

AAT GGG ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA 756
Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
240 245 250

AAC AAC CGG TTT CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA 798
Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly
255 260 265

TAC AAA GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC 840
Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys
270 275 280

TTT TTG CTT TGT GTT GTT TTG CTG GGG TTC ATC ATG TGG GCC 882
Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp Ala
285 290

TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT 918
Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
295 300 305
FIGURE 3

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT 45
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu
1 5 10 15

TGG CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC 90
Trp His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala
20 25 30

CCA TTC CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA 135
Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly
35 40 45

AGG GGG AGC ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT 180
Arg Gly Ser Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala
50 55 60

GGA AAG CAG ATA GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG 225
Gly Lys Gln Ile Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu
65 70 75

GCA CTT AAA ATG ACC ATG GAT CAT ATG TTA ATT CAG GAC CTC GAG 270
Ala Leu Lys Met Thr Met Asp His Met Leu Ile Gln Asp Leu Glu
80 85 90

AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCG 315
Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
95 100 105

GAG CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT GAT CTG ACT 360
Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr
110 115 120

GAC TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG 405
Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu
125 130 135

AGG GAA AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC 450
Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr
140 145 150

CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT 495
His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr
155 160 165

TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA AAC AAC CGG TTT 540
Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe
170 175 180

CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC 585
Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile
185 190 195
FIGURE 3 (cont'd)

CTG TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT GTT GTT 630
Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val
200 205 210

TTG CTG GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT AGG 675
Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg
215 220 225

TGC AAC ATT TGC ATT 690
Cys Asn Ile Cys Ile
230
FIGURE 4

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TCC TTT CTT 45
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu
1 5 10 15

TGG CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC 90
Trp His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala
20 25 30

CCA TTC CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG CAT GGA 135
Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly
35 40 45

TCA TAT GTT AAC AAG ACA CAA GAA GCT ATA AAC AAG ATA ACA AAA 180
Ser Tyr Val Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys
50 55 60

AAT CTC AAC TAT TTA AGT GAG CTA GAA GTA AAA AAC CTT CAA AGA 225
Asn Leu Asn Tyr Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg
65 70 75

CTA AGC GGA GCA ATG AAT GAG CTT CAC GAC GAA ATA CTC GAG CTA 270
Leu Ser Gly Ala Met Asn Glu Leu His Asp Glu Ile Leu Glu Leu
80 85 90

GAC GAA AAA GTG GAT GAT CTA AGA GCT GAT ACA ATA AGC TCA CAA 315
Asp Glu Lys Val Asp Asp Leu Arg Ala Asp Thr Ile Ser Ser Gln
95 100 105

ATA GAG CTT GCA GTC TTG CTT TCC AAC GAA GGG ATA ATA AAC AGT 360
Ile Glu Leu Ala Val Leu Leu Ser Asn Glu Gly Ile Ile Asn Ser
110 115 120

GAA GAT GAG CAT CTC TTG GCA CTT GAA AGA AAA CTG AAG AAA ATG 405
Glu Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu Lys Lys Met
125 130 135

CTT GGC CCC TCT GCT GTA GAA ATA GGG AAT GGG TGC TTT GAA ACC 450
Leu Gly Pro Ser Ala Val Glu Ile Gly Asn Gly Cys Phe Glu Thr
140 145 150

AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC AGG ATA GCT GCT GGC 495
Lys His Lys Cys Asn Gln Thr Cys Leu Asp Arg Ile Ala Ala Gly
155 160 165

ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT TTT GAT TCA TTA 540
Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu
170 175 180

AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG GAT AAT CAT 585
Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu Asp Asn His
185 190 195
FIGURE 4 (cont'd)

ACT ATA CTG CTC TAC TAC TCA ACT GCT GCT TCT AGC TTG GCT GTA 630
Thr Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu Ala Val
200 205 210

ACA TTA ATG ATA GCT ATC TTC ATT GTC TAC ATG GTC TCC AGA GAC 675
Thr Leu Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg Asp
215 220 225

AAT GTT TCT TGT TCC ATC TGT CTG
Asn Val Ser Cys Ser Ile Cys Leu
230
FIGURE 5

AATTCTCATG TTTGACAGCT TATCATCGAT AAGCTTCAGT TGAAGATATT AAGAACAGCC 60
TCGCAGATGA CGAATCATTG GGATTCCCAT CTTTTTTGTT TGTTGAAGGC GACACCATTG 120
GTTTTGCCAG AACTGTTTTC GGGCCGACCA CATCCGATCT GACAGATTTT TTAATCGGGA 180
AAGGAATGTC ATTAAGCAGT GGAGAGCGCG TTCAGATAGA GCCACTGATG AGGGGAACCA 240
CCAAAGACGA TGTTATGCAT ATGCATTTCA TCGGCCGAAC AACGGTGAAG GTAGAAGCCA 300
AGCTACCTGT ATTTGGCGAT ATATTAAAGG TCTTAGGGGC AACAGATATT GAAGGGGAGC 360
TTTTTGACTC ATTGGATATA GTCATTAAGC CAAAATTTAA AAGGGATATA AAAAAGGTTG 420
CCAAGGATAT TATTTTTAAC CCGTCACCTC AATTTTCAGA CATTAGCCTG CGGGCAAAAG 480
ATGAGGCCGG AGATATTTTA ACAGAACATT ATCTATCAGA AAAAGGCCAT CTCTCAGCGC 540
CTCTGAACAA GGTCACCAAT GCTGAGATAG CTGAAGAGAT GGCATATTGC TACGCAAGAA 600
TGAAAAGTGA TATACTGGAA TGTTTTAAAA GGCAGGTGGG CAAAGTTAAG GATTAATTAT 660
CAGGAGTAAT TATGCGGAAC AGAATCATGC CTGGTGTTTA CATAGTAATA ATTCCTTACG 720
TTATCGTAAG CATTTGCTAT CTCCTTTTCC GCCACTACAT TCCTGGTGTT TCTTTTTCAG 780
CTCATAGAGA TGGTCTTGGG GCGACATTGT CATCATATGC AGGAACCATG ATTGCAATCC 840
TGATTGCTGC CTTGACGTTT CTAATCGGAA GCAGAACGCG CCGACTGGCC AAGATTAGAG 900
AGTATGGGTA TATGACATCG GTAGTTATTG TCTATGCCCT TAGTTTTGTT GAGCTTGGAG 960
CTTTGTTTTT CTGCGGGTTA TTGCTTCTTT CCAGCATAAG CGGCTACATG ATACCCACTA 1020
TCGCCATCGG CATTGCCTCT GCATCGTTCA TTCATATATG CATCCTTGTT TTCCAACTAT 1080
ATAATTTGAC CAGAGAACAA GAATAACCCG GCCTCAGCGC CGGGTTTTCT TTGCCTCACG 1140
ATCGCCCCCA AAACACATAA CCAATTGTAT TTATTGAAAA ATAAATAGAT ACAACTCACT 1200
AAACATAGCA ATTCAGATCT CTCACCTACC AAACAATGCC CCCCTGCAAA AAATAAATTC 1260
ATATAAAAAA CATACAGATA ACCATCTGCG GTGATAAATT ATCTCTGGCG GTGTTGACAT 1320
AAATACCACT GGCGGTGATA CTGAGCACAT CAGCAGGACG CACTGACCAC CATGAAGGTG 1380
ACGCTCTTAA AAATTAAGCC CTGAAGAAGG GCAGCATTCA AAGCAGAAGG CTTTGGGGTG 1440
TGTGATACGA AACGAAGCAT TGGCCGTAAG TGCGATTCCG GATTAGCTGC CAATGTGCCA 1500
ATCGCGGGGG GTTTTCGTTC AGGACTACAA CTGCCACACA CCACCAAAGC TAACTGACAG 1560
GAGAATCCAG ATGGATGCAC AAACACGCCG CCGCGAACGT CGCGCAGAGA AACAGGCTCA 1620
FIGURE 5 (cont'd)

ATGGAAAGCA GCAAATCCCC TGTTGGTTGG GGTAAGCGCA AAACCAGTTC CGAAAGATTT 1680
TTTTAACTAT AAACGCTGAT GGAAGCGTTT ATGCGGAAGA GGTAAAGCCC TTCCCGAGTA 1740
ACAAAAAAAC AACAGCATAA ATAACCCCGC TCTTACACAT TCCAGCCCTG AAAAAGGGCA 1800
TCAAATTAAA CCACACCTAT GGTGTATGCA TTTATTTGCA TACATTCAAT CAATTGTTAT 1860

CTAAGGAAAT ACTTACAT ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA 1911
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val
1 5 10

GAT TGC TTT CTT TGG CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA 1959
Asp Cys Phe Leu Trp His Val Arg Lys Arg Val Ala Asp Gln Glu Leu
15 20 25

GGT GAT GCC CCA TTC CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA 2007
Gly Asp Ala Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu
30 35 40

AGA GGA AGG GGC AGC ACC CTC GGT CTG GAC ATC GAG ACA GCC ACA CGT 2055
Arg Gly Arg Gly Ser Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg
45 50 55

GCT GGA AAG CAG ATA GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG 2103
Ala Gly Lys Gln Ile Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu
60 65 70 75

GCA CTT AAA ATG ACC ATG GGT TTC TTC GGA GCT ATT GCT GGT TTC TTG 2151
Ala Leu Lys Met Thr Met Gly Phe Phe Gly Ala Ile Ala Gly Phe Leu
80 85 90

A A A ATG

GAA GGT GGT TGG GAA GGT CTC ATT GCA GGT TGG CAC GGA TAC ACA TCT 2199
Glu Gly Gly Trp Glu Gly Leu Ile Ala Gly Trp His Gly Tyr Thr Ser
95 Met 100 105

CAT GGA GCA CAT GGA GTG GCA GTG GCA GCA GAC CTT AAG AGT ACA CAA 2247
His Gly Ala His Gly Val Ala Val Ala Ala Asp Leu Lys Ser Thr Gln
110 115 120

GAA GCT ATA AAC AAG ATA ACA AAA AAT CTC AAC TAT TTA AGT GAG CTA 2295
Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr Leu Ser Glu Leu
125 130 135

GAA GTA AAA AAC CTT CAA AGA CTA AGC GGA GCA ATG AAT GAG CTT CAC 2343
Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met Asn Glu Leu His
140 145 150 155

GAC GAA ATA CTC GAG CTA GAC GAA AAA GTG GAT GAT CTA AGA GCT GAT 2391
Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp Leu Arg Ala Asp
160 165 170
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FIGURE 5 (cont'd) 



G . 

ACA ATA AGC TCA CAA ATA GAG CTT GCA GTC TTG CTT TCC AAC GAA GGT 2439 
Thr lie Ser Ser Gin lie Glu Leu Ala Val Leu Leu Ser Asn Glu Gly 
175 180 185 

A A T 

ATC ATC AAC AGT GAA GAC GAG CAT CTC TTG GCA CTT GAA AGA AAA CTG 2487 
He He Asn Ser Glu Asp Glu His Leu Leu Ala Leu Glu Arg Lys Leu 
190 195 200 

C T T G 

AAG AAA ATG CTT GGC CCC TCT GCT GTA GAA ATA GGG AAC GGT TGC TTT 2535 
Lys Lys Met Leu Gly Pro Ser Ala Val Glu He Gly Asn Gly Cys Phe 
205 210 215 



GAA ACC AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC AGG ATA GCT GCT 2583 
Glu Thr Lys His Lys Cys Asn Gin Thr Cys Leu Asp Arg He Ala Ala 
220 225 230 235 

GGC ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT TTT GAT TCA TTA 2631 
Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr Phe Asp Ser Leu 
240 245 250 

AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG GAT AAT CAT ACT 2679 
Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu Asp Asn His Thr 
255 260 265 

ATA CTG CTC TAC TAC TCA ACT GCT GCT TCT AGC TTG GCT GTA ACA TTA 2727 
Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu Ala Val Thr Leu 
270 275 280 

ATG ATA GCT ATC TTC ATT GTC TAC ATG GTC TCC AGA GAC AAT GTT TCT 2775 
Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg Asp Asn Val Ser 
285 290 295 

TGT TCC ATC TGT CTG TGAGGGAGAT TAAGCCCTGT GTTTTCCTTT ACTGTAGTGC 2830 
Cys Ser Ile Cys Leu 
300 

TCATTTGCTT GTCACCATTA CAAAGAAACG TTATTGAAAA ATGCTCTTGT TACTACTGAA 2890 
TTCTAGAATC GATAAGCTTC GACCGATGCC CTTGAGAGCC TTCAACCCAG TCAGCTCCTT 2950 
CCGGTGGGCG CGGGGCATGA CTATCGTCGC CGCACTTATG ACTGTCTTCT TTATCATGCA 3010 
ACTCGTAGGA CAGGTGCCGG CAGCGCTCTG GGTCATTTTC GGCGAGGACC GCTTTCGCTG 3070 
GAGCGCGACG ATGATCGGCC TGTCGCTTGC GGTATTCGGA ATCTTGCACG CCCTCGCTCA 3130 
AGCCTTCGTC ACTGGTCCCG CCACCAAACG TTTCGGCGAG AAGCAGGCCA TTATCGCCGG 3190 
CATGGCGGCC GACGCGCTGG GCTACGTCTT GCTGGCGTTC GTCCAGTAAT GACCTCAGAA 3250 
FIGURE 5 (cont'd) 

CTCCATCTGG ATTTGTTCAG AACGCTCGGT TGCCGCCGGG CGTTTTTTAT TGGTGAGAAT 3310 
CGCAGCAACT TGTCGCGCCA ATCGAGCCAT GTCGTCGTCA ACGACCCCCC ATTCAAGAAC 3370 
AGCAAGCAGC ATTGAGAACT TTGGAATCCA GTCCCTCTTC CACCTGCTGA GACGCGAGGC 3430 
TGGATGGCCT TCCCCATTAT GATTCTTCTC GCTTCCGGCG GCATCGGGAT GCCCGCGTTG 3490 
CAGGCCATGC TGTCCAGGCA GGTAGATGAC GACCATCAGG GACAGCTTCA AGGATCGCTC 3550 
GCGGCTCTTA CCAGCCTAAC TTCGATCACT GGACCGCTGA TCGTCACGGC GATTTATGCC 3610 
GCCTCGGCGA GCACATGGAA CGGGTTGGCA TGGATTGTAG GCGCCGCCCT ATACCTTGTC 3670 
TGCCTCCCCG CGTTGCGTCG CGGTGCATGG AGCCGGGCCA CCTCGACCTG AATGGAAGCC 3730 
GGCGGCACCT CGCTAACGGA TTCACCACTC CAAGAATTGG AGCCAATCAA TTCTTGCGGA 3790 
GAACTGTGAA TGCGCAAACC AACCCTTGGC AGAACATATC CATCGCGTCC GCCATCTCCA 3850 
GCAGCCGCAC GCGGCGCATC TCGGGCAGCG TTGGGTCCTG GCCACGGGTG CGCATGATCG 3910 
TGCTCCTGTC GTTGAGGACC CGGCTAGGCT GGCGGGGTTG CCTTACTGGT TAGCAGAATG 3970 
AATCACCGAT ACGCGAGCGA ACGTGAAGCG ACTGCTGCTG CAAAACGTCT GCGACCTGAG 4030 
CAACAACATG AATGGTCTTC GGTTTCCGTG TTTCGTAAAG TCTGGAAACG CGGAAGTCAG 4090 
CGCCCTGCAC CATTATGTTC CGGATCTGCA TCGCAGGATG CTGCTGGCTA CCCTGTGGAA 4150 
CACCTACATC TGTATTAACG AAGCGCTGGC ATTGACCCTG AGTGATTTTT CTCTGGTCCC 4210 
GCCGCATCCA TACCGCCAGT TGTTTACCCT CACAACGTTC CAGTAACCGG GCATGTTCAT 4270 
CATCAGTAAC CCGTATCGTG AGCATCCTCT CTCGTTTCAT CGGTATCATT ACCCCCATGA 4330 
ACAGAAATTC CCCCTTACAC GGAGGCATCA AGTGACCAAA CAGGAAAAAA CCGCCCTTAA 4390 
CATGGCCCGC TTTATCAGAA GCCAGACATT AACGCTTCTG GAGAAACTCA ACGAGCTGGA 4450 
CGCGGATGAA CAGGCAGACA TCTGTGAATC GCTTCACGAC CACGCTGATG AGCTTTACCG 4510 
CAGCTGCCTC GCGCGTTTCG GTGATGACGG TGAAAACCTC TGACACATGC AGCTCCCGGA 4570 
GACGGTCACA GCTTGTCTGT AAGCGGATGC CGGGAGCAGA CAAGCCCGTC AGGGCGCGTC 4630 
AGCGGGTGTT GGCGGGTGTC GGGGCGCAGC CATGACCCAG TCACGTAGCG ATAGCGGAGT 4690 
GTATACTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATGCGG 4750 
TGTGAAATAC CGCACAGATG CGTAAGGAGA AAATACCGCA TCAGGCGCTC TTCCGCTTCC 4810 
TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 4870 
PCT/US94/01149 

FIGURE 5 (cont'd) 

AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA 4930 
AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 4990 
CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 5050 
ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT 5110 
CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 5170 
TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 5230 
TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 5290 
GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 5350 
AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 5410 
TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 5470 
AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 5530 
TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 5590 
ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 5650 
TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA 5710 
AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 5770 
TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 5830 
ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 5890 
TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 5950 
GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 6010 
AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTGCAGG CATCGTGGTG 6070 
TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 6130 
ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 6190 
AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 6250 
ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTAGCTTCA CGCTGCCGCA 6310 
AGCACTCAGG GCGCAAGGGC TGCTAAAGGA AGCGGAACAC GTAGAAAGCC AGTCCGCAGA 6370 
AACGGTGCTG ACCCCGGATG AATGTCAGCT ACTGGGCTAT CTGGACAAGG GAAAACGCAA 6430 
GCGCAAAGAG AAAGCAGGTA GCTTGCAGTG GGCTTACATG GCGATAGCTA GACTGGGCGG 6490 
FIGURE 5 (cont'd) 

TTTTATGGAC AGCAAGCGAA CCGGAATTGC CAGCTGGGGC GCCCTCTGGT AAGGTTGGGA 6550 
AGCCCTGCAA AGTAAACTGG ATGGCTTTCT TGCCGCCAAG GATCTGATGG CGCAGGGGAT 6610 
CAAGATCTGA TCAAGAGACA GGATGAGGAT CGTTTCGCAT GATTGAACAA GATGGATTGC 6670 
ACGCAGGTTC TCCGGCCGCT TGGGTGGAGA GGCTATTCGG CTATGACTGG GCACAACAGA 6730 
CAATCGGCTG CTCTGATGCC GCCGTGTTCC GGCTGTCAGC GCAGGGGCGC CCGGTTCTTT 6790 
TTGTCAAGAC CGACCTGTCC GGTGCCCTGA ATGAACTGCA GGACGAGGCA GCGCGGCTAT 6850 
CGTGGCTGGC CACGACGGGC GTTCCTTGCG CAGCTGTGCT CGACGTTGTC ACTGAAGCGG 6910 
GAAGGGACTG GCTGCTATTG GGCGAAGTGC CGGGGCAGGA TCTCCTGTCA TCTCACCTTG 6970 
CTCCTGCCGA GAAAGTATCC ATCATGGCTG ATGCAATGCG GCGGCTGCAT ACGCTTGATC 7030 
CGGCTACCTG CCCATTCGAC CACCAAGCGA AACATCGCAT CGAGCGAGCA CGTACTCGGA 7090 
TGGAAGCCGG TCTTGTCGAT CAGGATGATC TGGACGAAGA GCATCAGGGG CTCGCGCCAG 7150 
CCGAACTGTT CGCCAGGCTC AAGGCGCGCA TGCCCGACGG CGAGGATCTC GTCGTGACTC 7210 
ATGGCGATGC CTGCTTGCCG AATATCATGG TGGAAAATGG CCGCTTTTCT GGATTCATCG 7270 
ACTGTGGCCG GCTGGGTGTG GCGGACCGCT ATCAGGACAT AGCGTTGGCT ACCCGTGATA 7330 
TTGCTGAAGA GCTTGGCGGC GAATGGGCTG ACCGCTTCCT CGTGCTTTAC GGTATCGCCG 7390 
CTCCCGATTC GCAGCGCATC GCCTTCTATC GCCTTCTTGA CGAGTTCTTC TGAGCGGGAC 7450 
TCTGGGGTTC GAAATGACCG ACCAAGCGAC GCCCAACCTG CCATCACGAG ATTTCGATTC 7510 
CACCGCCGCC TTCTATGAAA GGTTGGGCTT CGGAATCGTT TTCCGGGACG CCGGCTGGAT 7570 
GATCCTCCAG CGCGGGGATC TCATGCTGGA GTTCTTCGCC CACCCC 7616 
NS1 

ATGGATCCAAACACTGTGTCAAGCTTTCAGGTAGATTGCTTTCTTTGGCATGTCCGCAAA 

+ + + + + + 

TACCTAGGTTTGTGACAGAGTTCGAAAGTCCATCTAACGAAAGAAACCGTACAGGCGTTT 

1 

MetAspProAsnThrValSerSerPheGlnValAspCysPheLeuTrpHisValArgLys 

CGAGTTGCAGACCAAGAACTAGGTGATGCCCCATTCCTTGATCGGCTTCGCCGAGATCAG 
+ + + + + + 

GCTCAACGTCTGGTTCTTGATCCACTACGGGGTAAGGAACTAGCCGAAGCGGCTCTAGTC 

ArgValAlaAspGlnGluLeuGlyAspAlaProPheLeuAspArgLeuArgArgAspGln 

AAATCCCTAAGAGGAAGGGGCAGCACTCTTGGTCTGGACATCGAGACAGCCACACGTGCT 
+ + + + + + 

TTTAGGGATTCTCCTTCCCCGTCGTGAGAACCAGACCTGTAGCTCTGTCGGTGTGCACGA 

LysSerLeuArgGlyArgGlySerThrLeuGlyLeuAspIleGluThrAlaThrArgAla 

GGAAAGCAGATAGTGGAGCGGATTCTGAAAGAAGAATCCGATGAGGCACTTAAAATGACC 
+ + + + + + 

CCTTTCGTCTATCACCTCGCCTAAGACTTTCTTCTTAGGCTACTCCGTGAATTTTACTGG 
GlyLysGlnIleValGluArgIleLeuLysGluGluSerAspGluAlaLeuLysMetThr 
HA2 

ATGCAGATCCCGGCTGTGGGTAAAGAATTCAACAAATTAGAAAAAAGGATGGAAAATTTA 
+ + + + + + 

TACGTCTAGGGCCGACACCCATTTCTTAAGTTGTTTAATCTTTTTTCCTACCTTTTAAAT 
81 linker 65 69 

MetGlnIleProAlaValGlyLysGluPheAsnLysLeuGluLysArgMetGluAsnLeu 

AATAAAAAAGTTGATGATGGATTTCTGGACATTTGGACATATAATGCAGAATTGTTAGTT 
+ + + + + + 

TTATTTTTTCAACTACTACCTAAAGACCTGTAAACCTGTATATTACGTCTTAACAATCAA 
81 

AsnLysLysValAspAspGlyPheLeuAspIleTrpThrTyrAsnAlaGluLeuLeuVal 

CTACTGGAAAATGAAAGGACTCTGGATTTCCATGACTCAAATGTGAAGAATCTGTATGAG 
+ + + + + 

GATGACCTTTTACTTTCCTGAGACCTAAAGGTACTGAGTTTACACTTCTTAGACATACTC 
LeuLeuGluAsnGluArgThrLeuAspPheHisAspSerAsnValLysAsnLeuTyrGlu 
FIGURE 6 (cont'd) 

AAAGTAAAAAGCCAATTAAAGAATAATGCCAAAGAAATCGGAAATGGATGTTTTGAGTTC 
+ + + + + + 

TTTCATTTTTCGGTTAATTTCTTATTACGGTTTCTTTAGCCTTTACCTACAAAACTCAAG 

LysValLysSerGlnLeuLysAsnAsnAlaLysGluIleGlyAsnGlyCysPheGluPhe 

TACCACAAGTGTGACAATGAATGCATGGAAAGTGTAAGAAATGGGACTTATGATTATCCC 
+ + + + + + 

ATGGTGTTCACACTGTTACTTACGTACCTTTCACATTCTTTACCCTGAATACTAATAGGG 

150 

TyrHisLysCysAspAsnGluCysMetGluSerValArgAsnGlyThrTyrAspTyrPro 

AAATATTCAGAAGAGTCAAAGTTGAACAGGGAAAAGGTAGATGGAGTGAAATTGGAATCA 
+ + + + + + 

TTTATAAGTCTTCTCAGTTTCAACTTGTCCCTTTTCCATCTACCTCACTTTAACCTTAGT 

LysTyrSerGluGluSerLysLeuAsnArgGluLysValAspGlyValLysLeuGluSer 

ATGGGGATCTATCAGATTCTGGCGATCTACTCAACTGTCGCCAGTTCACTGGTGCTTTTG 
+ + + + + + 

TACCCCTAGATAGTCTAAGACCGCTAGATGAGTTGACAGCGGTCAAGTGACCACGAAAAC 

MetGlyIleTyrGlnIleLeuAlaIleTyrSerThrValAlaSerSerLeuValLeuLeu 

GTCTCCCTGGGGGCAATCAGTTTCTGGATGTGTTCTAATGGATCTTTGCAGTGCAGAATA 
+ + + + + + 

CAGAGGGACCCCCGTTAGTCAAAGACCTACACAAGATTACCTAGAAACGTCACGTCTTAT 

ValSerLeuGlyAlaIleSerPheTrpMetCysSerAsnGlySerLeuGlnCysArgIle 

TGCATCTGA 

ACGTAGACT 

222 
CysIle 
FIGURE 7 
FIGURE 7 (cont'd) 



AAA GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC 
Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys 
185 190 195 

TTT TTG CTT TGT GTT GTT TTG CTG GGG TTC ATC ATG TGG 
Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp 
200 205 

GCC TGC CAG AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT 
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile 
210 215 220 



TGA 
FIGURE 8 

GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT 39 
Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly 
1 5 10 

TGG GAG GGA ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT 78 
Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His 
15 20 25 

CAA AAT TCC GAG GGC ACA GGA CAA GCA GCA GAT CTT AAA 117 
Gln Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys 
30 35 

AGC ACT CAA GCA GCC ATC GAC CAA ATC AAT GGG AAA CTG 156 
Ser Thr Gln Ala Ala Ile Asp Gln Ile Asn Gly Lys Leu 
40 45 50 

AAT AGG GTA ATC GAG AAG ACG AAC GAG AAA TTC CAT CAA 195 
Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His Gln 
55 60 65 

ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG AGA ATT CAG 234 
Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln 
70 75 

GAC CTC GAG AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC 273 
Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu 
80 85 90 

TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG AAC 312 
Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn 
95 100 

CAA CAT ACA ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA 351 
Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys 
105 110 115 

CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA AAT GCT 390 
Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala 
120 125 130 

GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC CAC AAA 429 
Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys 
135 140 

TGT GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT 468 
Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr 
145 150 155 

TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA AAC AAC 507 
Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn 
160 165 



FIGURE 8 (cont'd) 

CGG TTT CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC 
Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr 
170 175 180 

AAA GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC 
Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys 
185 190 195 

TTT TTG CTT TGT GTT GTT TTG CTG GGG TTC ATC ATG TGG 
Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp 
200 205 

GCC TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT 
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile 
210 215 220 



TGA 
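
The codon rows and residue rows in the figures above can be cross-checked against each other mechanically. The following is a minimal sketch, not part of the original disclosure, that assumes the standard genetic code and translates one codon row into three-letter residue names. The names `translate`, `CODON_TABLE` and `THREE_LETTER` are illustrative only, and codons containing characters other than A, C, G or T are reported as `???` so that transcription damage is easy to spot.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(row: str) -> str:
    """Translate a row of codons (whitespace is ignored) to residue names."""
    seq = "".join(row.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    names = []
    for i in range(0, len(seq), 3):
        one_letter = CODON_TABLE.get(seq[i:i + 3])
        # Unknown codons (for example, ones containing a digit) are flagged.
        names.append(THREE_LETTER[one_letter] if one_letter else "???")
    return " ".join(names)


if __name__ == "__main__":
    # First codon row of FIGURE 8.
    print(translate("GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT"))
```

Run on the first codon row of FIGURE 8, the sketch prints `Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly`. Any row whose printed residues disagree with this kind of translation is a candidate for re-checking against the original drawing sheets.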
Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet) 
This international report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons: 

1. [ ] Claims Nos.: 
because they relate to subject matter not required to be searched by this Authority, namely: 

2. [ ] Claims Nos.: 
because they relate to parts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out, specifically: 

3. [ ] Claims Nos.: 
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 

This International Searching Authority found multiple inventions in this international application, as follows: 
Please See Extra Sheet. 

1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all searchable 
claims. 

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report covers 
only those claims for which fees were paid, specifically claims Nos.: 

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this international search report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 
1-6, 11-16, 21-23, 27-30 

Remark on Protest The additional search fees were accompanied by the applicant's protest. 

[ ] No protest accompanied the payment of additional search fees. 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

I. Claims 1-6, 11-16, 21-23 and 27-30, drawn to a vaccine containing an immunogenic fragment of Type A 
influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 435, 
subclasses, 89, 350 and 69.1, 172.1, 320.1. 

II. Claims 1-6, 11-16, 21-23 and 27-30, drawn to a vaccine containing an immunogenic fragment of Type B 
influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 435, 
subclasses, 89, 350 and 69.1, 172.1, 320.1. 

III. Claims 1-5, 7, 11-15, 17, 21, 22, 25 and 27-29, drawn to a vaccine containing an immunogenic fragment of 
Type A influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 
and 435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

IV. Claims 1-5, 7, 11-15, 17, 21, 22, 25 and 27-29, drawn to a vaccine containing an immunogenic fragment of 
Type B influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 
435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

V. Claims 1-5, 8, 11-15, 18, 21, 22, 25 and 27-29, drawn to a vaccine containing an immunogenic fragment of 
Type A influenza virus and including one polypeptide, protein, microorganism and vector, classes 424, 530 
and 435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

VI. Claims 1-5, 8, 11-15, 18, 21, 22, 24 and 27-29 drawn to a vaccine containing an immunogenic fragment of 
Type B influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 
435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

VII. Claims 1-5, 9, 11-15, 19, 21, 22, 26-29 and 31, drawn to a vaccine containing an immunogenic fragment of 
Type A influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 
and 435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

VIII. Claims 1-5, 9, 11-15, 19, 21, 22, 26-29 and 31 drawn to a vaccine containing an immunogenic fragment of 
Type B influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 
435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

IX. Claims 1-5, 11-15, 20-22, 27-29 and 32, drawn to a vaccine containing an immunogenic fragment of Type A 
influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 435, 
subclasses, 89, 350 and 69.1, 172.1, 320.1. 

X. Claims 1-5, 11-15, 20-22, 27-29 and 32, drawn to a vaccine containing an immunogenic fragment of 
Type B influenza virus and including one polypeptide, protein, microorganism and vector, classes 424,530 and 
435, subclasses, 89, 350 and 69.1, 172.1, 320.1. 

XI. Claims 10, 33, 34 and 35, drawn to a vaccine containing two polypeptides, wherein the first polypeptide has 
SEQ ID No. 10. 

XII. Claims 33, 34 and 35, drawn to a vaccine containing two polypeptides, wherein the first polypeptide has SEQ 
ID No. 12. 

XIII. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 16. 

XIV. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 18. 

XV. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 20. 

XVI. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 22. 
XVII. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 24. 

XVIII. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 26. 

XIX. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 27. 

XX. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 28. 

XXI. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 29. 

XXII. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 30. 

XXIII. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 32. 

XXIV. Claims 33, 35 and 36, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 34. 

XXV. Claims 33, 35 and 37, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 14. 

XXVI. Claims 33, 35 and 38, drawn to a vaccine containing two polypeptides, wherein the second polypeptide has 
SEQ ID No. 57. 

XXVII. Claims 39 and 40 drawn to a vaccine containing three polypeptides. 

and it considered that the International Application did not comply with the requirements of unity of invention (Rules 
13.1, 13.2 and 13.3) for the reasons indicated below: 

The claims of groups I-XXVII are drawn to multiple products which are not linked by a special technical 
feature so as to form a single inventive concept. PCT Rule 13.1 and Rule 13.2 do not provide for multiple products 
and methods within a single general inventive concept. 